ABSTRACT

The  nominal  purpose  of  this  study  was  to  examine  the  effects  of  stoploss  reinsurance  on  the 
comparative  performance  of  alternative  risk  adjustment  methods  used  to  establish  prospective 
payment  (capitation)  rates  for  health  care  services.  Models  and  standards  are  not  well  established 
that  define  how  the  influence  of  reinsurance,  or  any  seemingly  independent  component  of  the 
process  of  applying  risk  adjusted  rates  for  payment,  might  be  examined — particularly  in  terms  of 
statistical  effects.  Thus,  this  study  introduces  a  methodology  to  examine  the  broader  process  of 
applying  capitation  rates  in  the  context  of  principles  of  statistical  inference.  Since  capitation  rates 
are  essentially  sample  mean  values  from  some  larger  population  of  possible  values,  there  should  be 
some  identifiable  relationship  between  individual  and  group-level  measures  of  performance,  and 
that  relationship  should  involve  the  size  of  the  groups.  Failures  of  statistical  inference,  such  as 
marked  differences  between  individual  and  group-level  measures  of  predictive  accuracy,  can  be 
interpreted  as  evidence  of  bias  in  the  process  as  a  whole.  This  study  involved  a  split-half  analysis 
of  data  drawn  from  two  independent  practitioner  association  model  health  maintenance 
organizations.  Sampling  for  estimation  and  validation  subpopulations,  and  pseudo-group  practices 
of  various  sizes,  was  based  on  primary  care  provider  assignment  in  each  plan  to  introduce  a  modest 
degree  of  natural  selection.  Results  of  this  study  show  that  increasingly  lower  stoploss  levels  make 
differences  between  alternative  risk  adjustment  methods  more  distinct,  but  lower  levels  remove  the 
practical  effect  of  applying  risk  adjustment  on  group-level  measures  of  performance.  This  study 
also  demonstrates  that  using  the  log  of  underlying  cost  values  in  establishing  expectations  of  total 
health  service  costs  will  reduce  bias  in  those  expectations  that  is  unrelated  to  any  specific  risk 
adjustment  method.  Failure  to  address  such  bias  will  confound  the  comparison  of  risk  adjustment 
alternatives. 

EXECUTIVE  SUMMARY

The  study  presented  in  this  report  relies  on  an  understanding  that  the  purpose  of  health  insurance  is 
to  transfer  some  or  all  of  the  financial  risk  associated  with  seeking  medical  care  from  an  individual 
to  an  insurer.  That  agreement  is  feasible  because  the  insurer  is  able  to  pool  the  risks  of  a  large 
number  of  individuals  and  to  predict,  for  the  group  as  a  whole,  the  probable  costs  associated  with 
covered  services  during  a  given  period.  While  insurance  is  intended  to  provide  a  means  to  share 
financial  risk,  the  current  health  care  system  has  evolved  such  that  insurers  are  driven  to  reduce 
their  accountable  risk.  The  rapid  rise  in  health  care  costs  in  particular,  driven  in  large  part  by 
movements  that  began  during  the  1950s  and  1960s — toward  employer  sponsored  health  benefits, 
comprehensive  benefit  packages,  advances  in  medical  technology,  and  increasing  coverage  of  the 
aged  under  the  federal  Medicare  program — has  served  to  undermine  what  was  once  the 
communitywide  basis  for  the  payment  of  health  care  services. 

One  consequence  of  that  changing  environment  is  that  most  commercial  insurers  now  tend  to  focus 
on  administering  the  needs  of  large  self-insured  employers  to  the  exclusion  of  small  employers  and 
individuals.  Increasing  market  segmentation — where  attractive  risks  are  pooled  and  poorer  risks 
lack  organized  purchasing  power — has  made  community-rated  health  care  products  less  available 
and  affordable.  Now,  those  with  known  chronic  illness  can  be  excluded  from  coverage  and  face 
serious  financial  impediments  to  receiving  care.  Those  who  approach  the  insurance  market  as 
individuals  or  in  small  groups  can  face  very  high  insurance  premium  rates  that,  effectively,  preclude 
their  coverage.  Uninsured  or  underinsured  individuals  may  forgo  purchasing  necessary  care,  and 
run  the  risk  of  financial  ruin,  when  confronted  with  high  health  care  costs.  As  insurers  increasingly 
base  premiums  on  individual  differences,  insurance  acts  less  to  spread  risk  over  a  pool  of 
beneficiaries  than  as  a  prepayment  mechanism  for  health  care  costs  (GAO  1989,  GAO  1991,  GAO 
1992,  Luft  1995). 

Another  consequence  of  the  changing  environment  is  that  providers  are  now  asked  to  play  an 
increasingly  important  role  in  the  distribution  of  financial  risk.  Today,  payment  systems  are  in 
place  that  shift  accountable  risk  from  insurers  to  provider  plans,  most  notably  through  capitated 
payment  to  health  maintenance  organizations  (HMOs).  These  arrangements  may  simply 
exacerbate  an  already  deteriorating  health  insurance  market  if  providers  respond  by  trying  to  avoid 
high  insurance  risks,  or  by  otherwise  withholding  services  for  economic  reasons. 

The  uncertainty  of  coverage  for  individuals  and  small  groups  in  the  face  of  rising  costs,  and  the 
changing  locus  of  financial  risk  in  a  segmented  market  tend  to  undermine  the  financial  risk 
protection  that  traditional  insurance  is  intended  to  provide — at  least  for  those  who  represent  less 
attractive  risks.  In  response,  a  substantial  amount  of  research  has  been  devoted  to  the  development
of  methods  to  account  (or  adjust)  for  the  relative  financial  risk  any  given  health  plan  or  provider 
assumes  when  enrolling  members  on  a  fixed-payment  basis,  such  as  community  rating  or
capitation.  While  a  variety  of  risk  adjustment  methodologies  exists,  no  method  has  been  identified 
as  clearly  preferable,  particularly  in  terms  of  predictive  accuracy  and  administrative  feasibility,  to 
implement  on  a  systemwide  basis  (AAA  1993,  Gauthier  et  al.  1995,  NHPF  1994). 

One  point  of  consensus  that  has  emerged  from  recent  health  reform  debate  is  that  approaches  to 
increase  health  insurance  coverage  and  contain  costs  should  build  on  the  nation's  current  insurance 
system  (Gauthier  et  al.  1995).  Reinsurance,  in  particular,  has  been  proposed  as  an  interim  measure 
to  moderate  the  potential  effects  of  biased  selection  into  health  plans  as  part  of  both  state  and 
federal  health  system  reform  (NHPF  1994,  PPRC  1994,  AAA  1993,  WHTF  1993,  Bovbjerg  1992, 
GAO  1992,  Schramm  1992).  Reinsurance  is  a  formal  device  to  shift  (or  cede)  some  portion  of  an 
insurance  risk  from  a  primary  carrier  to  a  reinsurer.  It  is  typically  used  in  health  care  financing  to 
control  the  effects  of  "outlier"  (high  cost)  cases  (Bovbjerg  1992).    It  is  also  commonly  required  in 
HMOs  as  a  condition  of  state  licensure  (Ward  1993). 

Research  into  how  reinsurance  might  be  used  in  combination  with  emerging  risk  adjustment 
methods  to  moderate  risk  to  providers  who  accept  capitation  payment  rates  is  limited.  Thus,  the 
nominal  goal  of  this  study  was  to  get  a  better  understanding  of  the  relationship  between  risk 
limitation  associated  with  reinsurance  and  risk  adjustment  that  is  used  to  account  for  differences  in 
the  distribution  of  health  service  costs.  That  relationship  was  examined  primarily  in  terms  of 
statistical  effects  because  reinsurance  effectively  modifies  the  variation  in  underlying  costs  that  are 
used  to  estimate  future  expenditures.  More  generally,  this  study  was  designed  to  examine 
reinsurance  in  the  context  of  what  Luft  (1995)  and  others  (Bowen  1995,  Gauthier  1995)  have 
suggested  is  a  need  to  shift  the  focus  from  a  search  for  the  perfect  risk  adjustment  model  to  the 
process  of  risk  adjustment  used  to  pay  for  health  care  services. 

The  Course  of  Analysis 

In  the  course  of  this  analysis  it  became  clear  that  the  key  to  understanding  the  effects  of 
reinsurance,  or  any  seemingly  independent  component  of  the  broader  process  of  applying  risk 
adjusted  payment  rates,  is  to  come  to  terms  with  the  nature  and  extent  of  bias  associated  with  each 
component  that  contributes  to  that  process  as  a  whole.  Bias,  in  this  context,  can  be  defined  as  some 
discrepancy  between  expected  and  actual  results  of  using  prospective  payment  rates. 
Consequently,  the  essential  focus  of  this  study  was  the  identification  of  bias  in  the  distribution  of 
health  service  payments. 
Furthermore,  and  in  keeping  with  the  literature  on  risk  adjustment,  this  study  examined  both
individual  and  group-level  measures  of  model  performance.  The  comparison  of  those  measures  in 
the  context  of  statistical  evidence  of  bias  made  it  apparent  that  the  perspective  of  analysis  is  an 
important  factor  in  understanding  the  effects  of  the  various  components  of  the  process  of  risk 
adjustment.  For  example,  the  individual-level  R²  associated  with  a  risk  adjustment  model  based  on
actual  (untransformed)  dollars  is  shown  to  be  as  good  as  or  better  than  the  R²  for  the  same  model
based  on  a  more  complex  data  treatment  used  to  address  bias  in  the  underlying  distribution  of 
health  service  costs.  Yet,  the  expectations  derived  from  the  more  complex  model  are  consistently 
better  estimates  of  future  costs,  in  terms  of  bias,  when  measured  at  the  group  level.  The  results  of 
this  study  illustrate  that  group-level  measures  provide  a  means  to  clearly  demonstrate  the  extent  to 
which  alternative  risk  adjustment  methods  address  the  bias  associated  with  the  distribution  of 
health  service  costs,  as  well  as  the  sum  of  the  bias  inherent  in  the  application  of  the  risk  adjustment 
methods.  In  other  words,  group-level  measures  provide  a  means  to  identify  components  of  bias 
that  individual-level  measures  will  not  readily  reveal. 

Methods  and  standards  for  making  group-level  assessments  in  the  application  of  risk  adjustment 
methods  are  not  well  established — in  practice  or  in  the  literature.  One  contribution  of  the  analysis 
in  this  report  is  that  it  introduces  the  use  of  a  technique  for  assessing  the  application  of  risk 
adjustment  methods  at  the  group  level  within  the  context  of  data  truncation  (associated  with 
stoploss  reinsurance  in  this  study)  and  the  relative  size  of  the  groups  (pseudo-group  practices  in 
this  study).  This  technique  draws  on  basic  principles  of  statistical  inference  regarding  variance, 
sample  size,  and  the  estimation  of  means  to  identify  the  existence  of  bias  in  risk  adjusted 
expectations.  The  basic  premise  for  this  type  of  analysis  is  that,  since  expectations  are  essentially 
sample  mean  values  from  some  larger  population  of  possible  values,  there  should  be  some 
identifiable  relationship  between  individual  and  group-level  results,  and  that  relationship  should 
involve  the  size  of  the  groups.  The  patterns  of  group-level  measures  of  predictive  accuracy, 
particularly  across  truncation  levels  and  groups  of  different  relative  sizes,  should  indicate  the  level 
of  untreated  (or  remaining)  bias  in  the  risk  adjusted  process  as  a  whole. 
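
The premise above can be written as a short worked relation. This is a sketch under assumptions that are not stated in the report: the individual prediction errors e_i = y_i - ŷ_i (actual minus expected cost) are taken to be independent within a group, with a common mean b (the bias) and a common variance σ². The symbols e_i, b, σ², and n (group size) are introduced here only for illustration.

```latex
\bar{e}_n = \frac{1}{n}\sum_{i=1}^{n} e_i, \qquad
\mathrm{E}[\bar{e}_n] = b, \qquad
\mathrm{Var}(\bar{e}_n) = \frac{\sigma^2}{n}, \qquad
\mathrm{E}\!\left[\bar{e}_n^{\,2}\right] = b^2 + \frac{\sigma^2}{n}
```

Under these assumptions the random component of group-level error shrinks in proportion to 1/n, while b does not depend on group size. Group-level error that fails to shrink as groups grow therefore points to bias rather than to chance variation. Where enrollees within a group are not independent (for example, under selection by PCP), the variance term would shrink more slowly than shown.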

To  support  this  analysis,  data  were  drawn  from  two  geographically  distinct,  but  otherwise 
comparable,  independent  practitioner  association  model  HMOs.  Both  plans  were  owned  by  the 
same  national  management  company.  They  each  served  working  populations  with  no  formal 
Medicare  or  Medicaid  enrollment  and  used  the  same  data  system  structure  to  record  service  claims 
and  other  administrative  information. 

Risk  adjustment  methodologies  included  in  this  simulation  were  based  on  age  and  gender  alone,  age 
and  gender  plus  a  flag  for  the  presence  of  any  chronic  condition,  Ambulatory  Care  Groups  (ACGs), 
and  Ambulatory  Diagnostic  Groups  (ADGs).  Reinsurance  was  modeled  at  four  stoploss
thresholds  ($50,000,  $25,000,  $10,000,  and  $5,000)  with  a  10  percent  coinsurance  rate.  That  modeling
involved  truncating  charges  for  any  given  individual  in  the  study  at  the  stoploss  threshold,  and  then 
reapplying  10  percent  of  the  truncated  costs  to  that  individual. 
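
The truncation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: it assumes that "truncated costs" means the amount above the threshold, so the plan retains costs up to the threshold plus 10 percent of the excess. The function name `apply_stoploss` and the sample cost values are hypothetical.

```python
# The four stoploss thresholds modeled in the study, in dollars.
THRESHOLDS = (50_000, 25_000, 10_000, 5_000)


def apply_stoploss(cost, threshold, coinsurance=0.10):
    """Return the cost retained after stoploss reinsurance.

    Assumption: costs are truncated at the threshold and the
    coinsurance share of the excess is added back.
    """
    excess = max(cost - threshold, 0.0)
    return min(cost, threshold) + coinsurance * excess


# Hypothetical annual costs for four enrollees.
costs = [1_200.0, 8_000.0, 30_000.0, 120_000.0]

for threshold in THRESHOLDS:
    retained = [apply_stoploss(c, threshold) for c in costs]
    print(threshold, retained)
```

Costs below a threshold pass through unchanged, so lowering the threshold affects a growing share of enrollees and compresses the upper tail of the cost distribution.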

Cost  expectations  (risk  adjusted  payment  rates)  for  total  health  service  costs  were  derived  using 
ordinary  least  squares  regression  of  the  actual-dollar  form  of  costs,  as  well  as  the  log  of  those 
amounts.  Risk-class-specific  parameter  estimates  were  calculated  using  an  estimation  sample  from 
each  of  the  two  study  sites.  Those  estimates  were  then  used  to  establish  expected  health  service 
costs  for  separate  validation  samples  in  each  respective  site.  Sampling  used  to  establish  estimation 
and  validation  subpopulations,  and  pseudo-groups  of  plan  enrollees  of  various  sizes  (500,  1500, 
3000,  and  5000)  to  support  group-level  analyses,  was  based  on  primary  care  provider  (PCP) 
assignment  in  each  plan.  In  contrast  to  strict  random  sampling  of  individuals,  which  minimizes 
selection  differences  in  study  populations,  the  availability  of  PCP  assignment  made  it  possible  to 
introduce  a  modest  level  of  natural  selection  to  the  analysis. 
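
The estimation and validation steps can be sketched as follows. Several assumptions here go beyond the text: risk classes are treated as mutually exclusive (true of age and gender cells or ACGs, but not of ADGs), in which case ordinary least squares on class indicators reduces to class means; all costs are strictly positive so that the log is defined; and log-scale expectations are returned to dollars with a smearing factor, a common retransformation that the report does not say it used. All names and data values are hypothetical.

```python
import math
from collections import defaultdict


def fit_class_means(sample, transform=lambda cost: cost):
    """Mean of transform(cost) per risk class.

    With one indicator per mutually exclusive class, the OLS
    coefficients equal these class means.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for risk_class, cost in sample:
        totals[risk_class] += transform(cost)
        counts[risk_class] += 1
    return {rc: totals[rc] / counts[rc] for rc in totals}


def fit_log_model(sample):
    """Class means on the log scale plus a smearing factor."""
    log_means = fit_class_means(sample, math.log)
    residuals = [math.log(cost) - log_means[rc] for rc, cost in sample]
    smear = sum(math.exp(r) for r in residuals) / len(residuals)
    return log_means, smear


# Hypothetical samples: (risk_class, annual cost in dollars).
estimation = [("low", 200.0), ("low", 400.0),
              ("high", 1_000.0), ("high", 9_000.0)]
validation = [("low", 350.0), ("high", 4_000.0)]

dollar_means = fit_class_means(estimation)
log_means, smear = fit_log_model(estimation)

# Apply the estimation-sample parameters to the validation sample.
for risk_class, actual in validation:
    from_dollars = dollar_means[risk_class]
    from_logs = math.exp(log_means[risk_class]) * smear
    print(risk_class, actual, round(from_dollars, 2), round(from_logs, 2))
```

The two sets of expectations differ because the log model fits the center of a skewed distribution, where a few extreme values dominate the dollar-scale means.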

Summary  of  Results 

With  respect  to  the  nominal  focus  of  this  study,  individual-level  measures  of  performance  improve 
for  any  given  risk  adjustment  method  with  successively  lower  stoploss  reinsurance  levels.  At  the 
same  time,  differences  between  risk  adjustment  alternatives  on  those  measures  widen  at  lower 
levels.  In  other  words,  increasingly  lower  truncation  levels  associated  with  reinsurance  thresholds 
make  differences  between  risk  adjustment  methods  more  distinct.  In  terms  of  group-level  measures 
such  as  the  percentage  of  groups  that  fall  within  5  percent  of  actual  costs,  however,  there  are  no
significant  differences  between  alternative  risk  adjustment  methods,  regardless  of  group  size,  at 
stoploss  levels  of  $10,000  or  less  in  this  study.  Thus,  lower  stoploss  levels  can  remove  the 
practical  effect  of  applying  risk  adjustment  in  setting  payment  rates. 
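
The group-level measure named above can be illustrated with a short sketch. It assumes that a group "falls within 5 percent" when its total expected cost differs from its total actual cost by no more than 5 percent of the actual total, and it also computes the predictive ratio of actual to expected costs. The function names and the three tiny groups are hypothetical; the study's pseudo-groups held 500 to 5000 enrollees.

```python
def group_totals(group):
    """Sum actual and expected costs over (actual, expected) pairs."""
    actual = sum(a for a, _ in group)
    expected = sum(e for _, e in group)
    return actual, expected


def predictive_ratio(group):
    """Ratio of actual to expected total cost for one group."""
    actual, expected = group_totals(group)
    return actual / expected


def share_within(groups, tolerance=0.05):
    """Fraction of groups whose expected total is within
    `tolerance` (as a share of actual cost) of the actual total."""
    hits = 0
    for group in groups:
        actual, expected = group_totals(group)
        if abs(expected - actual) <= tolerance * actual:
            hits += 1
    return hits / len(groups)


# Hypothetical groups of (actual, expected) costs per enrollee.
groups = [
    [(100.0, 110.0), (300.0, 300.0)],
    [(500.0, 400.0), (100.0, 100.0)],
    [(200.0, 205.0), (200.0, 190.0)],
]

print([round(predictive_ratio(g), 3) for g in groups])
print(share_within(groups))
```

Comparing this share across risk adjustment methods, stoploss levels, and group sizes is one way to read the group-level results summarized in this section.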

Traditional  insurance  theory  suggests  that  stoploss  is  intended  to  deal  with  outliers  that  are,  in  any 
other  sense,  unpredictable.  By  contrast,  risk  adjustment  is  used  to  account  for  identifiable  (and, 
therefore,  predictable)  differences  throughout  a  population.  Stoploss  reinsurance  can  be  used  to 
complement  the  process  of  risk  adjustment  by  moderating  the  extreme  distribution  of  costs.  That
may  make  the  costs  that  are  subject  to  risk  adjustment  more  predictable.  However,  from  a 
traditional  perspective  regarding  insurance,  it  would  be  inappropriate  to  use  stoploss  to  offset 
available  gains  in  the  explanatory  power  of  alternative  risk  adjustment  methods  because  those  gains 
would  be,  by  definition,  made  at  the  expense  of  information  that  could  be  used  to  explain 
differences  in  costs. 

Other  highlights  from  the  analysis  include:

►  In  terms  of  individual-level  measures,  alternative  risk  adjustment  methods  perform  in  a 
predictable  pattern — relative  to  each  other — across  alternative  performance  measures  and 
across  comparable  populations  (health  plans). 

►  Through  the  examination  of  group-level  measures,  this  study  illustrates,  and  reaffirms,  the
earlier  work  of  Duan  et  al.  (1982)  regarding  the  treatment  of  health  service  cost  data.  That  work 
suggested  that  data  transformation  included  in  the  process  of  establishing  cost  expectations, 
such  as  using  the  log  of  underlying  cost  values,  would  reduce  bias  in  those  expectations  that  is 
unrelated  to  any  specific  risk  adjustment  method.  An  important  implication  of  this  result  is  that 
a  failure  to  address  problems  inherent  in  the  underlying  distribution  of  the  dependent  cost  data 
may  confound  the  comparison  of  risk  adjustment  alternatives  and,  more  importantly,  produce 
biased  expectations  of  costs.  The  practical  significance  of  these  results  is  a  function  of  the 
extent  to  which  selection  differences  actually  occur  in  the  distribution  of  health  plan  enrollees. 

►  Comparing  group-level  performance  measures  based  on  predictive  ratios  of  actual  to  expected 
costs  derived  using  untransformed-dollar  cost  values  with  the  same  measures  derived  using 
appropriate  data  transformation  can  provide  a  relative  measure  of  the  extent  of  selection 
differences  between  the  population  used  to  estimate  cost  expectations  and  the  target  population. 

►  In  the  absence  of  otherwise  confounding  factors,  such  as  bias  in  the  distribution  of  the 
underlying  data,  expectations  derived  using  ADGs  conform  as  well  as  might  be  expected, 
considering  basic  principles  of  statistical  inference,  in  explaining  the  costs  associated  with 
health  status  differences.  Mean  forecasting  bias  was  reduced  to  near  zero  (0)  in  groups  of 
approximately  5000  plan  enrollees.  ADGs  do  not  seem  to  contribute  substantial  inherent  bias 
to  underlying  expectations  for  the  two  health  plans  in  this  study.  By  contrast,  age  and  gender, 
together,  seemed  to  contribute  approximately  4  percent  to  mean  forecasting  bias  given 
expectations  based  on  those  factors  alone  (and  stoploss  at  $25,000),  regardless  of  the  size  of  the 
groups,  in  the  two  health  plans  in  this  study. 

►  A  common  practice,  in  simulations  such  as  this  study,  is  to  limit  the  influence  of  extreme  high- 
cost  individual  cases  by  removing  (truncating)  all  charges  above  some  predetermined  threshold. 
Simply  truncating  extreme  values  in  this  study  introduced,  rather  than  reduced,  bias  in 
underlying  expectations.  Thus,  the  process  of  truncating  costs  may  be  a  source  of  bias  in  risk 
adjusted  expectations  of  costs,  and  in  the  assessment  of  the  performance  of  alternative  risk 
adjustment  methods. 

In  the  Final  Analysis

There  is  a  sense  in  which  this  research  (and  the  accompanying  analysis)  is  simply  a  review  of  well- 
established  theory  from  several  disciplines.  An  actuary  would  easily  recognize  the  conclusions 
regarding  the  application  of  stoploss  in  the  process  of  risk  adjustment.  A  statistician  would 
recognize  the  implications  for  underlying  inferences  in  the  treatment  of  dependent  cost  values.  A 
health  service  researcher  would  accept  the  suggestion  that  risk  adjustment  might  be  used  to  discern 
variation  in  performance  beyond  that  attributable  to  health  status  alone,  since  that  is  its  essential 
purpose.  What  is  different  about  this  analysis  is  that  it  is  an  attempt  to  assimilate  aspects  of  those 
disciplines  to  examine  the  process  of  applying  risk  adjustment  for  payment  as  a  whole.  Moreover, 
the  results  of  this  study  suggest  that  existing  risk  adjustment  methods  may  already  be  adequate  to 
approach  the  broader  purpose  of  accounting  for  health  status  differences,  at  least  in  terms  of  bias  in 
the  distribution  of  costs,  and  that  techniques  for  group-level  analysis  exist  to  facilitate  the 
assessment  of  that  distribution.  Finally,  the  population  orientation  of  both  insurance  theory  and 
principles  of  statistical  inference  may  help  lay  the  foundation  for  understanding  why  it  is  important 
to  approach  risk  adjustment  as  a  process  affecting  populations,  rather  than  focus  on  risk  at  the 
individual  level,  as  a  means  to  control  health  care  costs  and,  by  extension,  improve  the  distribution 
of  (access  to)  health  care  services. 

CHAPTER  1
INTRODUCTION 

The  purpose  of  health  insurance  is  to  transfer  some  or  all  of  the  financial  risk  associated  with 
seeking  medical  care  from  an  individual  to  an  insurer.  Insurers  agree  to  pay  specific  losses  suffered 
by  the  insured  in  return  for  a  premium.  This  agreement  is  feasible  because  the  insurer  is  able  to 
pool  the  risks  of  a  large  number  of  individuals  and  to  predict,  for  the  group  as  a  whole,  the  probable 
costs  associated  with  covered  services  during  a  given  period  (Phelps  1992). 

While  insurance  is  intended  to  provide  a  means  to  share  financial  risk,  the  current  health  care 
system  has  evolved  such  that  both  insurers  and  providers  are  driven  to  reduce  their  accountable 
risk.  During  the  1950s  and  1960s,  a  movement  toward  comprehensive  benefit  packages,  advances 
in  medical  technology,  and  increasing  coverage  of  the  aged  under  the  federal  Medicare  program 
helped  stimulate  rapidly  rising  health  care  costs.  In  response,  insurers  began  to  target  specific 
markets  to  help  stabilize  their  premium  pricing  and  to  limit  their  risk.  By  the  early  1970s, 
premiums  for  large  employer  groups,  in  particular,  were  commonly  based  on  the  actual  health 
service  use  of  such  groups  through  a  process  of  "experience  rating."  As  employers  began  to  realize 
that  they  might  do  better  financially  to  assume  the  risk  themselves,  many  large  employers  began  to 
self-insure.  Thus,  pools  of  relatively  low-risk  employees  were  drawn  away  from  otherwise 
community-rated  markets  (HIAA  1995,  Newhouse  1993B,  Starr  1982). 

Commercial  insurers  now  tend  to  focus  on  administering  the  needs  of  self-insuring  large  employers 
to  the  exclusion  of  small  employers  and  individuals.  Underwriting  practices  in  small  group  and 
individual  markets  can  exclude  certain  individuals  based  on  pre-existing  conditions,  and  insurers 
can  refuse  to  renew  or  offer  coverage  to  certain  industries  and  groups  (Wrightson  1990,  Herrle 
1993).  Insurers  may  also  avoid  providing  benefits  and  coverage  options  that  they  find  attract  high 
users  (Jones  1995).  They  may  emphasize  others,  such  as  well  baby  care,  that  attract  low  users 
(Anderson  et  al.  1986B).  Increasing  market  segmentation — where  attractive  risks  are  pooled  and 
poorer  risks  lack  organized  purchasing  power — has  made  community-rated  health  care  products 
less  available  and  affordable  (GAO  1989,  GAO  1991,  GAO  1992,  Luft  1995). 

One  consequence  is  that  those  with  known  chronic  illness  can  be  excluded  from  coverage  and  face 
serious  financial  impediments  to  receiving  care.  Those  who  approach  the  insurance  market  as 
individuals  or  in  small  groups  can  face  burdensomely  high  insurance  premium  rates  that, 
effectively,  preclude  their  coverage.  Uninsured  or  underinsured  individuals  may  forgo  purchasing 
necessary  care,  and  run  the  risk  of  financial  ruin,  when  confronted  with  high  health  care  costs.  As 
insurers  increasingly  base  premiums  on  individual  differences,  insurance  acts  less  to  spread  risk 
over  a  pool  of  beneficiaries  than  as  a  prepayment  mechanism  for  health  care  costs  (GAO  1991). 


Providers  also  play  an  increasingly  important  role  in  the  distribution  of  financial  risk.  Payment 
systems  are  now  in  place  that  shift  accountable  risk  from  insurers  to  providers  or  provider  plans. 
The  Medicare  Prospective  Payment  System,  for  example,  involves  payment  of  a  predetermined  rate 
for  hospital  services  associated  with  specific  diagnoses,  largely  without  regard  to  the  actual  costs 
incurred  by  providers  for  any  specific  case.  Capitated  payments  for  services  in  health  maintenance 
organizations  (HMOs)  are  another  notable  example  of  the  shift  in  financial  risk.  HMOs  provide  a 
defined  set  of  benefits  over  a  specified  period  of  time  in  return  for  a  set  premium,  or  capitation 
amount.  Capitation  is  referred  to  as  a  "risk-based"  payment  mechanism  (as  opposed  to  fee-for- 
service-based  reimbursement  under  traditional  indemnity  insurance)  because  it  involves  setting 
rates  for  a  package  of  covered  services  on  a  prospective  basis,  and  the  assumption  of  financial  risk 
in  providing  those  services,  regardless  of  actual  service  use  during  the  coverage  period.  HMOs, 
and  other  managed  care  systems  of  delivery,  offer  the  potential  to  control  costs  by  organizing 
providers  into  groups  or  networks  and  by  integrating  the  financing  and  delivery  of  care  (HIAA 
1995).  Stated  another  way,  providers  who  are  responsible  for  managing  the  delivery  of  care  are 
now  also — at  some  level — at  financial  risk  for  those  services. 

An  HMO  may  retain  all  of  its  associated  financial  risk  at  an  overall  organizational  level,  such  as  in 
a  staff-model  plan  where  individual  providers  are  salaried  or  paid  on  a  per-service  basis. 
Increasingly,  however,  that  risk  is  focused  more  narrowly  when  physicians  within  managed  care 
plans  are,  themselves,  reimbursed  on  a  capitated  basis.  More  than  20  percent  of  primary  care- 
based  payments  are  expected  to  be  through  some  form  of  capitation  by  the  year  2000  (Cave  1994). 
As  they  become  an  integral  part  of  the  financial  risk-bearing  arrangements  in  health  plans, 
providers  have  a  vested  interest  in  becoming  familiar  with  how  risks  are  selected  and  in  affecting 
the  underwriting  process  (Smith  1992). 

Payment  arrangements  that  shift  some  of  the  financial  risk  associated  with  providing  care  onto  the 
provider  may  simply  exacerbate  an  already  deteriorating  health  insurance  market.  In  the  absence  of 
some  countervailing  incentive,  individual  physicians,  or  groups  of  physicians,  who  accept  capitation  payments
for  their  services  have  the  same  incentives  insurers  do  to  selectively  limit  the  risk  they  assume.  The 
realization  of  financial  risk  on  the  part  of  providers  could,  for  example,  affect  the  delivery  system 
to  discourage  enrollment  and  encourage  disenrollment  of  those  who  are  likely  to  use  services.  This 
type  of  risk  selection  may  have  serious  consequences  for  vulnerable  (high-risk)  populations.  In  the 
extreme,  providers  at  risk  for  some  portion  of  the  cost  of  care  associated  with  a  costly  chronic 
condition  may  delay  or  discourage  the  use  of  expensive  treatments  that  may  alleviate  some  aspect 
of  the  condition,  and  that  might  otherwise  be  provided  in  the  absence  of  that  risk.  Aside  from  the 
potential  that  they  may  not  receive  appropriate  care,  patients  who  perceive  providers  as 
unresponsive  to  their  needs  in  this  way  may  seek  out  alternative  providers  who  are  less  sensitive  to 
their  associated  risk.  Resultant  selection  effects  across  providers  may  also  affect  the  financial 
viability  of  health  plans,  particularly  those  that  do  not  practice  risk  selection  as  effectively  as  their
competitors  (Gauthier  et  al.  1995,  Newhouse  et  al.  1989,  van  Vliet  1992). 

The  uncertainty  of  coverage  for  individuals  and  small  groups  in  the  face  of  rising  costs,  and  the 
changing  locus  of  financial  risk  in  a  segmented  market  tend  to  undermine  the  financial  risk 
protection  that  traditional  insurance  is  intended  to  provide — at  least  for  those  who  represent  less 
attractive  risks.  In  response,  many  states  have  enacted  or  proposed  a  series  of  health  reforms  to 
create  a  more  competitive  insurance  market.  Those  include:  market  rule  changes,  such  as 
guaranteed  issue  and  renewal  of  coverage;  rating  reform,  particularly  in  the  form  of  community- 
rated  premiums;  risk  adjustment  to  moderate  adverse  selection  across  plans  in  the  face  of  other 
market  restrictions;  and  group  purchasing  incentives  for  small  groups  (GAO  1994A,  Luft  1995, 
PPRC  1995).  The  prevailing  theme  of  those  efforts  is  that  market  incentives  can  be  used  to
encourage  insurers  (and  health  plans)  to  compete  on  the  basis  of  efficiency  and  service,  rather  than 
on  their  ability  to  selectively  avoid  risk  (Bowen  1995,  Enthoven  and  Kronick  1989,  Enthoven  and 
Singer  1995,  Newhouse  1994,  PPRC  1995).

RISK  ADJUSTMENT 

Many  reform  proposals  envision  a  world  of  competing  health  plans  that  receive  a  fixed  premium 
per  person.  Risk  adjustment  used  to  establish  regulated  premium  rates  is  a  critical  component  of 
those  reforms,  to  offset  perverse  selection  incentives  and  to  adequately  compensate  providers  who 
would  otherwise  be  motivated  to  avoid  high  risks  to  maintain  a  competitive  edge  (AAA  1994B, 
GAO  1994A,  PPRC  1995,  WHTF  1993).  Risk  adjustment  is  a  means  to  account  for  the  relative 
financial  risk  any  given  health  plan  or  provider  assumes  when  enrolling  members  on  a  fixed-
payment  basis,  such  as  community  rating  or  capitation.  Risk  adjustment  methods  reflect  person-
specific  characteristics  that  are  used  to  help  control  for,  or  "explain",  the  underlying  variation  in
resource  use  in  some  meaningful  way  that  is  related  to  how  those  resources  can  be  expected  to  be 
distributed  in  the  future.  If  payments  reflect  enough  of  that  underlying  variation,  insurers  and 
providers  who  assume  risk  will  be  more  likely  to  compete  based  on  price,  service,  and  quality  rather 
than  on  the  ability  to  segment  risk  (Bowen  1995). 

It  should  be  noted  that  the  goal  in  adjusting  for  the  financial  risk  associated  with  different  patient 
populations  is  not  to  explain  all  of  the  underlying  variation  in  costs.  If  100  percent  of  the  risk 
involved  in  providing  services  was  accounted  for  in  risk  adjustment,  no  incentive  would  remain  to 
provide  those  services  efficiently.  It  is  not  likely,  in  any  case,  that  more  than  a  limited  proportion 
of  that  variation  can  be  explained.  Commonly  cited  estimates  suggest  that  only  14  to  20  percent  of 
the  differences  in  annual  health  care  costs  among  individuals  can  be  predicted  using  any  likely  risk 
adjustment  formulation  (McCall  and  Wai  1983,  Newhouse  et  al.  1989,  Van  Vliet  1992,  Welch 
1985).  The  remaining  variation  is  generally  unpredictable  because  it  is  due  to  chance  or  otherwise 
unknowable  factors.  Thus,  as  a  practical  matter,  risk  adjustment  is  used  to  moderate  the  financial 
risk health plans assume within the context of prevailing methods for identifying and assessing that
risk. 

Relatively  simple  risk  adjustment  methods  based  on  demographic  criteria  have  been  used  in 
actuarial  analysis  for  decades.  Age  and  gender  are  commonly  accepted  adjustment  criteria  largely 
because  they  are  easily  identifiable,  they  are  predictable  over  time,  and  they  are  administratively 
easy  to  apply  (Browne  and  Doerpinghaus  1993,  Kongstvedt  1993,  Wrightson  1993).  Age  and 
gender,  along  with  institutional  and  welfare  statuses,  underlie  the  calculation  of  risk-based 
payments  to  HMOs  under  the  federal  Medicare  program  (Palsbo  1989).  Many  states  use  factors 
such  as  age,  family  size,  and  geography  to  adjust  community-rated  premiums  as  part  of  their  small 
group  market  reform  efforts  (PPRC  1995). 

The  basic  problem  with  relying  solely  on  demographic  or  sociodemographic  rating  factors  is  that 
they  do  not  reflect  health  differences  that  more  naturally  account  for  variation  in  resource  use 
across health plans and providers. Rating factors that reflect very heterogeneous levels of medical
risk  provide  a  strong  incentive  to  encourage  biased  selection  into  and  out  of  health  plans.  Plans 
could,  for  example,  structure  benefits  or  focus  advertising  to  attract  enrollees  who  tend  to  be 
healthier  than  average,  such  as  young  families.  Plans  can  also  encourage  disenrollment  by  making 
access  difficult  for  those  who  need  services  (Epstein  and  Cumella  1988,  McClure  1984,  Van  de 
Ven  and  Schut  1994). 

The  limited  relationship  between  demographic  or  sociodemographic  factors  and  health  expenditures 
is reflected in the fact that those factors tend to explain only a small portion of the variation in
resource use at the individual level. The demographic factors underlying the AAPCC have been shown to
explain between 0.6 and 1.0 percent of the variation in annual expenditures under the Medicare program (Anderson
et  al.  1986A,  Lubitz  et  al.  1985).  Age  and  gender  explain  only  slightly  more  of  the  variation  in 
resource  use  among  non-Medicare  populations  (Dunn  et  al.  1995,  Fowles  et  al.  1994,  Weiner  et  al. 
1991,  1994).  Thus,  these  factors  provide  very  limited  information  with  which  to  moderate 
selection  incentives  in  health  plans.  Most  recent  research  in  this  area  has  focused  on  the 
development  of  a  practical,  reliable,  and  administratively  feasible  way  to  include  some  measure  of 
health  status  in  risk  adjustment  formulations  (Newhouse  1986,  Epstein  and  Cumella  1988,  Van  de 
Ven and Van Vliet 1992).

The  development  and  assessment  of  risk  adjustment  methodologies  to  meet  the  needs  of  various 
reform  efforts  is  a  significant  ongoing  concern  within  health  services  research.  The  problem — for 
policy  makers  and,  more  directly,  for  the  health  plans  that  will  assume  risk  under  such  reforms — is 
that  there  are  important  technical  issues  that  need  to  be  resolved.  While  a  variety  of  risk 
adjustment  methodologies  exists,  no  method  has  been  identified  as  clearly  preferable,  particularly 
in  terms  of  predictive  accuracy  and  administrative  feasibility,  to  implement  on  a  systemwide  basis 
(AAA 1993, Gauthier et al. 1995, NHPF 1994). It is in the context of this lack of consensus that
reinsurance,  as  opposed  to  relying  on  risk  adjustment  alone,  has  been  proposed  as  an  interim 
measure  to  moderate  the  potential  effects  of  biased  selection  into  health  plans  as  part  of  both  state 
and  federal  health  system  reform  (NHPF  1994,  PPRC  1994,  AAA  1993,  WHTF  1993,  Bovbjerg 
1992,  GAO  1992,  Schramm  1992). 

REINSURANCE 

Reinsurance  is  a  formal  device  to  shift  (or  cede)  some  portion  of  an  insurance  risk  from  a  primary 
carrier  to  a  reinsurer.  It  is  traditionally  used  to  improve  the  underwriting  capacity  of  insurers — that 
is,  the  extent  to  which  they  can  assume  liability — and  to  stabilize  the  loss  expectations  associated 
with  defined  risks.  Under  such  arrangements,  the  financial  risk  that  insurers  assume  might  be 
limited  to  some  predetermined  cost  ceiling,  or  it  might  be  shared  on  a  proportional  basis  with  the 
reinsurer,  depending  on  the  terms  of  the  agreement  (Kramer  1980,  Baker  1980). 

Reinsurance  is  typically  used  in  health  care  financing  to  control  the  effects  of  "outlier"  (high  cost) 
cases  (Bovbjerg  1992).  Individual-level  stoploss  coverage,  for  instance,  involves  setting  a 
threshold for health service costs above which some portion of direct expenses is covered through
a  reinsurer's  pool  of  funds.  All  enrollees  in  a  health  plan  are  covered  by  the  pool  on  a  per-risk 
premium  basis  so  that  the  cost  of  the  reinsurance  coverage  is  spread  across  the  population  as  a 
whole.  Reconciliation  is  made  retrospectively.  Through  this  type  of  agreement,  the  insurer  can 
more  dependably  project  overall  losses  in  return  for  a  set  reinsurance  payment. 
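
The individual-level stoploss arrangement described above can be sketched in a few lines of Python. This is an illustration only: the function name `split_stoploss`, the `provider_share` parameter (the fraction of costs above the threshold that the primary carrier still pays), and all dollar figures are assumptions, not values drawn from any plan discussed here.

```python
def split_stoploss(annual_cost, threshold, provider_share=0.0):
    """Split one enrollee's annual cost into retained and ceded parts.

    provider_share is a hypothetical parameter: the fraction of costs
    above the threshold that the primary carrier continues to pay.
    """
    excess = max(annual_cost - threshold, 0.0)
    retained = min(annual_cost, threshold) + provider_share * excess
    ceded = (1.0 - provider_share) * excess
    return retained, ceded


# Made-up annual costs for five enrollees (illustration only).
costs = [0.0, 400.0, 2_500.0, 30_000.0, 120_000.0]
threshold = 25_000.0

splits = [split_stoploss(c, threshold, provider_share=0.10) for c in costs]
retained_total = sum(r for r, _ in splits)
ceded_total = sum(c for _, c in splits)

# Per-risk premium at which the pool would just cover ceded costs,
# ignoring any loading for administration or profit.
break_even_premium = ceded_total / len(costs)
```

Retained and ceded amounts always sum to total costs, so the arrangement redistributes losses rather than reducing them; the break-even premium simply spreads the ceded amount across every enrollee.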

Pro  rata,  or  proportional,  forms  of  reinsurance  involve  sharing  both  premium  income  and  losses  on 
a  proportional  basis  between  a  primary  insurer  and  a  reinsurer.  These  agreements  are  not 
commonly  used  by  private-sector  health  plans  (Bovbjerg  1992),  although  some  pro  rata-like 
methods  to  limit  provider  risk  have  been  proposed  as  part  of  state  or  federal  government  health 
system  reform.  One  method  involves  reinsuring  specific  high-risk  individuals  or  groups,  or 
treatments  for  specific  diagnoses.  In  Connecticut,  for  example,  high-risk  cases  in  the  small-group 
and  individual  market  are  identified  in  advance  for  a  high-risk  coverage  program.  Premium 
payments  are  made  by  the  insurer  to  a  reinsurance  pool.  At  the  end  of  a  contract  period,  costs  in 
excess  of  premium  income  to  the  reinsurance  pool  are  shared  on  a  pro  rata  basis  among  those  who 
participate  in  the  pool.  Such  programs,  essentially,  segment  (or  "carve-out")  specific  types  of  risks 
to  concentrate  the  economic  impact  of  those  risks  more  narrowly  than  does  the  stoploss  coverage 
just  described. 

There  is  relatively  limited  incentive  to  manage  high-risk  cases  efficiently  under  this  type  of  program 
because  the  costs  of  that  care  are  distributed  widely  among  participating  insurers.  One  alternative 
enacted  in  New  York  State  is  to  establish  a  pool  for  specific  high-cost  diagnoses.  Standardized 
payments  are  made  on  a  prospective  basis  to  the  primary  carrier.  Because  those  carriers  do  not 
receive reimbursement based on their actual expenses, they have the incentive to provide related
services  as  efficiently  as  possible  (NYDI  1992,  PPRC  1995).  Overall  costs  associated  with  the 
reinsurance  pool  are  distributed  across  participating  carriers. 

As  discussed  above,  risk  adjustment  reduces  providers'  incentive  to  select  only  good  medical  risks 
by  (ideally)  accounting  and  compensating  for  the  risk  profile  of  their  enrolled  population  relative  to 
those  enrolled  in  other  plans.  The  financial  risk  that  plans  then  retain  is  a  function  of  their  ability  to 
provide  care  efficiently  rather  than  the  nature  of  the  particular  medical  risks  they  insure.  By 
contrast,  stoploss  reinsurance  simply  lowers  the  overall  threshold  of  financial  risk  a  primary  insurer 
assumes.  It  does  not  directly  address  the  incentive  for  selection  bias  in  capitated  health  plans 
because  there  is  no  individual-level  accounting  for  the  risk  profile  of  the  reinsured  population. 
Forms  of  reinsurance  that  entail  removing  specific  high-cost  cases  from  the  general  coverage 
population  may  reduce  the  incentive  to  select  against  those  particular  high-cost  cases.  However, 
the incentive for biased selection in the general population remains, even in the absence of those
specific  high-cost  cases,  unless  some  effort  is  made  to  adjust  for  remaining  risk  differences. 

One compelling rationale for reinsurance is the contribution it can make in stabilizing the calculation
of  expected  values  that  underlie  capitation  payment  rates.  The  primary  reason  for  including 
stoploss  provisions,  in  particular,  under  health  reform  is  that  they  make  the  distribution  of  cost 
outcomes  per  person  more  dense  by  eliminating  outcomes  that  are  much  greater  than  the  average 
for any given risk adjustment category. This reduces the variance otherwise associated with
calculating  payment  rates,  and  may  reduce  providers'  incentive  to  identify  good  and  bad  risks 
within  categories  (WHTF  1993).  In  this  sense,  stoploss  reinsurance  can  be  considered  in  the  same 
context  as  other  statistical  methods,  or  techniques,  used  to  analyze  medical  cost  data. 

The  distribution  of  health  service  cost  data  for  a  random  population  is  typically  characterized  by  a 
concentration  of  low  or  no  cost  cases,  a  diminishing  frequency  of  cases  as  costs  rise,  and  a  few  very 
high  cost  cases.  Researchers  will  commonly  transform  such  data  using  the  log  of  costs  to  correct 
for  that  skewness  (Shwartz  and  Ash  1994).  They  may  also  employ  "multipart"  models  to  treat 
those  who  use  services  separately  from  those  who  do  not  (Duan  et  al.  1982,  Rossiter  et  al.  1994, 
Robinson  et  al.  1991,  Wouters  1991).  Alternatively,  some  researchers  trim  or  truncate  outlier 
cases. Newhouse et al. (1989, 1993A), for example, trimmed outlier cases, setting charges above
the  98th  percentile  to  the  mean  of  the  highest  2  percent  of  charges  to  preserve  the  overall  mean  of 
medical  charges.  Anderson  et  al.  (1990)  truncated  expenditures  at  the  99th  percentile  because  the 
upper  1  percent  were  deemed  too  difficult  for  any  model  to  predict.  Robinson  et  al.  (1991) 
truncated  charges  at  $25,000  per  person  under  the  assumption  that  charges  above  that  level  were 
fundamentally  unpredictable.  Weiner  et  al.  (1994)  also  truncated  charges  at  $25,000  per  case  to 
limit  the  effects  of  outliers  and  to  reflect  stoploss  coverage  then  used  in  the  health  plans  they 
studied.  Dunn  et  al.  (1995)  truncated  expenditures  at  $25,000  in  their  main  analyses,  and  at 
$50,000 in a sensitivity analysis under similar assumptions. Reflecting stoploss reinsurance levels
when  modeling  payments  for  health  services  is  one  way  to  explicitly  incorporate  a  financial 
mechanism  that  is  already  commonly  used  to  control  the  effects  of  outlier  cases. 
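
The two outlier treatments described above behave differently, which a small sketch can make concrete. The code below is illustrative only: the cost vector is made up, the function names are hypothetical, and a 20 percent tail is used in place of the 2 percent tail so that the effect is visible in a ten-person example. Truncation caps each cost at a threshold and so lowers both the mean and the variance; trimming to the tail mean lowers the variance while leaving the overall mean unchanged.

```python
import statistics


def truncate(costs, threshold):
    """Cap each cost at the threshold (a stoploss-style truncation)."""
    return [min(c, threshold) for c in costs]


def trim_to_tail_mean(costs, tail_fraction):
    """Replace the highest costs with their own mean.

    Because the tail is replaced by its mean, the overall mean is preserved.
    """
    n_tail = max(1, round(len(costs) * tail_fraction))
    ordered = sorted(costs)
    body, tail = ordered[:-n_tail], ordered[-n_tail:]
    tail_mean = sum(tail) / n_tail
    return body + [tail_mean] * n_tail


# Made-up, right-skewed annual costs for ten enrollees (illustration only).
costs = [0, 0, 150, 300, 600, 1_200, 2_500, 6_000, 40_000, 110_000]

truncated = truncate(costs, 25_000)
trimmed = trim_to_tail_mean(costs, 0.20)

for label, data in [("raw", costs), ("truncated", truncated), ("trimmed", trimmed)]:
    # Print the mean and population standard deviation of each version.
    print(label, round(statistics.mean(data), 1), round(statistics.pstdev(data), 1))
```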

Once  again,  however,  the  statistical  effect  of  such  reinsurance  would  simply  be  to  adjust  the  mean 
and  variance  of  the  distribution  used  to  calculate  payment  rates.  That  is  not  likely  to  be  sufficient 
to  eliminate  the  premium  differential  attributable  to  health  status.  Moreover,  reinsurance  may 
dilute  incentives  to  manage  care  effectively  if  the  threshold  of  risk  is  set  too  low.  Once  a  stoploss 
threshold  is  reached  in  any  given  case,  the  financial  risk  to  the  provider  is  limited  to  a  copayment,  if 
any,  above  that  threshold.  As  a  general  rule,  the  stoploss  level  established  for  a  primary  carrier 
should  be  set  no  lower  than  the  point  at  which  associated  losses  become  unpredictable  (Baker 
1980).  In  practice,  that  level  is  a  function  of  plan  size,  reinsurance  claim  experience,  and  plan 
management's  aversion  to  risk  (Bovbjerg  1992,  Ward  1993).  Low  reinsurance  thresholds  may  also 
require  a  considerable  level  of  claim  review  and  represent  a  trade-off  between  lowered  risk  and 
administrative efficiency (WHTF 1993, Ward 1993).

RISK  ADJUSTMENT  AND  REINSURANCE 

One  point  of  consensus  that  has  emerged  from  recent  health  reform  debate  is  that  approaches  to 
increase  health  insurance  coverage  and  contain  costs  should  build  on  the  nation's  current  insurance 
system  (Gauthier  et  al.  1995).  Reinsurance,  which  is  commonly  required  in  HMOs  as  a  condition 
of  state  licensure  (Ward  1993),  is  an  important  component  of  that  system.  Reinsurance  could  be 
used  in  combination  with  emerging  risk-adjustment  methods  to  moderate  risk  to  providers  who 
accept  capitation  payment  rates.  Very  little  research  has  been  published  that  explores  how 
reinsurance may affect the application of the types of risk adjustment methods that are now being
considered under health reform. How, for example, might different reinsurance thresholds affect
the  relative  assessment  of  risk  adjustment  models?  Would  a  lower  stoploss  threshold  on  a  simple 
demographic  model  achieve  comparable  predictive  accuracy  to  that  of  a  more  sophisticated  model 
with  a  higher  threshold? 

It  is  useful  to  note  that  there  is  some  disagreement  regarding  the  most  appropriate  way  to  assess  the 
relative  performance  of  risk  adjustment  methods.  To  date,  the  most  commonly  used  criterion  is  the 
R-square  (R2)  associated  with  calculating  expected  values  at  the  individual  level  (Epstein  and 
Cumella  1988,  Shwartz  and  Ash  1994).  Newhouse  (1994),  in  particular,  has  focused  on  this 
measure  under  the  assumption  that  health  plans  operationalize  risk-selection  behavior — that  is, 
selectively  encouraging  and  discouraging  good  and  bad-risk  patients — based  on  individual 
differences. 

Other  researchers  have  suggested  that  focusing  on  group-level  measures  of  predictive  accuracy  is  a 
more  appropriate  way  to  assess  risk-adjustment  methods,  in  part  because  they  reflect  how  those 
methods are actually applied (Hornbrook and Goodman 1994, Rossiter et al. 1994, Welch 1985).
Both  risk  adjustment  and  the  business  of  insurance,  generally,  are  group-level  processes.  They 
require  a  sufficiently  large  population  across  which  to  establish  appropriate  adjustment  factors — in 
the  case  of  risk  adjustment — or  to  spread  assumable  risk — in  the  case  of  insurance.  While  the  R2 
has  been  routinely  associated  with  model  development,  group-level  measures  have  more  typically 
been  reported  in  the  context  of  the  practical,  or  simulated,  application  of  risk  adjustment  methods 
for  payment  purposes.  As  such,  generating  group-level  measures  involves  a  set  of  issues  that 
individual-level  measures  do  not  commonly  reflect.  Those  issues  include,  but  are  not  limited  to, 
selection  effects  into  groups,  changes  in  practice  patterns  within  health  plans,  and  inflation  factors. 

In  addition  to  any  potential  contribution  to  predictive  accuracy,  reinsurance  may  also  affect  the 
incentive  health  plans  retain  to  provide  services  efficiently.  A  very  low  stoploss  threshold  may 
undermine  that  incentive.  A  relatively  high  threshold  may  not  remove  enough  of  the  variation  due 
to  high-cost  cases  to  benefit  smaller  plans,  or  those  more  averse  to  such  risk.  Finally,  some 
accounting  also  needs  to  be  made  of  the  trade-off  reinsurance  entails  between  limiting  risk  and 
contributing  to  the  administrative  burden  health  plans  and  providers  experience.  A  modest 
improvement  in  predictive  accuracy  may  come  at  the  expense  of  burdensome  administrative 
oversight  when  a  significant  number  of  claims  reach  the  threshold  level.  Conversely,  the  added 
claim  review  associated  with  a  lower  threshold  on  a  simple  demographic  model  may  offset  the  more 
demanding  data  requirements  of  claim-based  risk  adjustment  methods. 

The  general  goal  of  this  study  is  to  get  a  better  understanding  of  the  relationship  between  risk 
limitation  associated  with  reinsurance  and  risk  adjustment  used  to  account  for  differences  in  the 
distribution  of  health  service  costs.  The  specific  aims  of  this  study  are: 

•  to  examine  the  contribution  that  various  stoploss  reinsurance  options  make  to  the  relative 
performance  of  risk  adjustment  methods; 

•  to  compare  the  statistical  effects,  and  resultant  predictive  accuracy,  of  reinsurance 
modeling  and  common  mathematical  treatments,  such  as  log  transformation  and  multipart 
methods,  used  to  improve  the  dependability  of  inferences  made  from  risk  adjustment 
models;  and, 

•  to  explore  the  relationship  between  various  stoploss  reinsurance  thresholds  and  subsequent 
changes  in  group  (versus  individual)  measures  of  model  performance,  given  a  selection  of 
risk  adjustment  methods. 

In this study, data drawn from two moderately sized IPA-model HMOs will be used to simulate
risk-adjusted capitation payments made to providers. While the primary focus is on capitation rate
assignment to providers, the results of this study should have parallel implications for risk-adjusted
transfer  payments  made  in  the  context  of  community-rated  premiums  under  many  state  health 
reform  programs  as  well.  While  the  incentives  for  consumers  and  providers  are  clearly  different  in 
each  case,  the  mechanism  of  risk  adjustment  is  essentially  the  same  whether  it  is  applied  at  the  level 
of  payments  to  providers  or  in  the  process  of  adjusting  community-rated  premiums  (Bowen  1994). 

Versions  of  four  risk  adjustment  methods  that  are  routinely  suggested  in  the  literature  on  risk 
adjustment  will  be  applied  to  populations  of  non-Medicare  enrollees  (AAA  1993,  GAO  1994B, 
WHTF  1993).  The  methods  include:  simple  demographics,  a  chronic  condition  flag,  the 
Ambulatory  Care  Group  case-mix  system,  and  Ambulatory  Diagnostic  Groups.  Actual-dollar 
(untransformed), as well as 1-, 2-, and 4-part log-transformed versions of those models will be
calculated.  Categorical  and  continuous  versions  of  age  will  be  defined.  Four  truncation  levels 
reflecting  stoploss  thresholds  will  be  used. 
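
To indicate what a multipart, log-transformed calculation involves, the following is a generic two-part sketch for a single risk class, not the specification used in this study. The function name, the data, and the retransformation step (averaging the exponentiated log-scale residuals) are assumptions made for illustration. Part one estimates the probability of any use; part two estimates mean cost among users on the log scale and converts it back to dollars.

```python
import math


def two_part_expected_cost(costs):
    """Return (naive, retransformed) expected cost per person for one class.

    Assumes at least one person in the class has a positive cost.
    """
    users = [c for c in costs if c > 0]
    p_use = len(users) / len(costs)            # part 1: probability of any use
    logs = [math.log(c) for c in users]        # part 2: log cost among users
    mean_log = sum(logs) / len(logs)
    # Retransformation factor: mean of exponentiated log-scale residuals.
    smear = sum(math.exp(v - mean_log) for v in logs) / len(logs)
    naive = p_use * math.exp(mean_log)         # ignores retransformation
    return naive, naive * smear


# Made-up annual costs: two non-users and three users (illustration only).
costs = [0.0, 0.0, 200.0, 800.0, 3_200.0]
naive, retransformed = two_part_expected_cost(costs)
```

Simply exponentiating the mean log cost yields a geometric mean, which understates expected dollars for skewed data. With no covariates, as here, the retransformed estimate reproduces the arithmetic mean; the two approaches diverge only once risk factors enter the log-scale model.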

Once  the  various  versions  and  iterations  of  risk-adjustment  methods  are  calculated,  the  relative 
performance  of  each  will  be  compared  using  individual  and  group-level  measures.  In  keeping  with 
much  of  the  literature  associated  with  the  development  of  risk-adjustment  methods,  the  primary 
individual-level  assessment  criterion  will  be  the  adjusted  R2  (Epstein  and  Cumella  1988,  Shwartz 
and  Ash  1994).  The  R2  reflects  the  extent  to  which  the  collection  of  independent  variables  in  a 
given  model  succeeds  in  explaining,  or  accounting  for,  the  overall  variation  in  the  dependent 
measure  (service  costs).  It  is  typically  adjusted  to  reflect  the  number  of  risk  factors  included  as 
independent  variables  in  the  underlying  calculation,  particularly  when  the  population  used  for 
analysis  is  relatively  small.  Other  individual-level  measures  of  performance  will  be  drawn  from 
comparisons  of  actual  and  expected  values  to  assess  the  predictive  accuracy  of  alternative  methods. 
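
For reference, the textbook forms of the R2 and its adjustment can be written as follows. This is a minimal sketch with made-up values; `n_predictors` stands for the number of risk factors entered as independent variables, and the function names are illustrative rather than taken from any statistical package.

```python
def r_squared(actual, predicted):
    """Share of variation in actual values accounted for by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R2 for the number of predictors relative to sample size."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Made-up actual and expected costs for five people (illustration only).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 3.0, 4.0, 4.5]

r2 = r_squared(actual, predicted)
adj = adjusted_r_squared(r2, n_obs=len(actual), n_predictors=1)
```

The adjustment matters most when the number of observations is small relative to the number of risk factors; with large enrollee populations the two values are nearly identical.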

Most  previous  studies  that  include  group-level  measures  of  model  performance  rely  on  random 
and/or  systematically  skewed  aggregations  of  patients  (Anderson  et  al.  1990,  Dunn  et  al.  1995, 
Hayes 1991, Hornbrook et al. 1991B, Hornbrook and Goodman 1995, Fowles et al. 1995,
Robinson  et  al.  1991).  This  study  is  unusual  in  that  it  is  designed  around  naturally  occurring 
aggregations  of  patients  within  the  two  study  sites  that  make  it  possible  to  form  groups  that  reflect 
some  level  of  the  selection  bias  that  might  occur  in  actual  practice.  Primary  care  physician 
assignment  will  be  used  to  establish  groups  of  enrollees  of  various  sizes.  Group-level  assessment 
criteria  will  include  mean  forecasting  bias,  the  mean  squared  forecasting  error  of  that  bias,  and 
other  measures  of  how  well  payments  for  groups  of  enrollees  reflect  actual  costs,  given  alternative 
risk  adjustment  methods.  While  this  study  will  focus  primarily  on  measures  of  predictive  accuracy, 
other  assessment  criteria,  such  as  administrative  feasibility,  will  also  be  examined. 
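
One plausible way to compute such group-level measures is sketched below. The definitions are assumptions made for illustration, since the formal definitions are not given at this point in the text: each group's forecast error is taken as predicted minus actual cost per member, the mean forecasting bias is the average of those errors across groups, and the mean squared forecasting error is the average of their squares. Group identifiers stand in for assigned primary care physicians, and all figures are made up.

```python
from collections import defaultdict


def group_errors(records):
    """Per-member forecast error (predicted minus actual) for each group.

    records: iterable of (group_id, actual_cost, predicted_cost) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for group_id, actual, predicted in records:
        entry = totals[group_id]
        entry[0] += actual
        entry[1] += predicted
        entry[2] += 1
    return {g: (pred - act) / n for g, (act, pred, n) in totals.items()}


# Made-up enrollee records: (group, actual cost, expected cost).
records = [
    ("A", 100.0, 300.0),
    ("A", 900.0, 500.0),
    ("B", 0.0, 400.0),
    ("B", 200.0, 400.0),
    ("B", 4_000.0, 600.0),
]

errors = group_errors(records)
mean_bias = sum(errors.values()) / len(errors)
mean_squared_error = sum(e ** 2 for e in errors.values()) / len(errors)
```

A negative error under this sign convention means the group was underpaid relative to its actual costs.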

This  study  is  not  intended  to  develop  a  specific  risk  adjustment  model.  Instead,  it  seeks  to  examine 
how  such  models  fare  in  the  presence  of  one  very  common  risk  limitation  mechanism  when  they  are 
used to set prospective payment rates. From a policy perspective, results from this study may most
appropriately be read in the context of what Luft (1995) and others (Bowen 1995, Gauthier et al. 1995)
have  suggested  is  a  need  to  shift  the  focus  from  a  search  for  the  perfect  risk  adjustment  model  to 
the process of applying risk adjustment for payment. Reinsurance is likely to be a component of
that  process. 

In  summary,  the  distribution  and  management  of  financial  risk  associated  with  providing  health 
care  services  is  directly  related  to  how  well  the  overall  health  care  system  meets  the  needs  of  a  wide 
spectrum  of  health  care  risks,  from  the  working  healthy  to  the  chronically  ill.  As  providers  are 
asked  to  assume  more  of  that  risk,  at  least  two  critical  issues  need  to  be  addressed  to  ensure  that 
those  providers  are  fairly  and  adequately  protected  from  excessive  financial  risk.  With  respect  to 
fairness,  risk-based  payments  should  account  for  health  care  differences  in  populations  of  covered 
individuals.  Risk  adjustment  methods  that  accomplish  that  accounting,  and  their  application,  are 
still  evolving.  At  the  same  time,  some  method  is  needed  to  adequately  limit  the  extreme  variation 
in  costs  associated  with  rendering  health  care  services  to  help  ensure  the  solvency  of  risk-bearing 
providers.  If  reinsurance  can  be  shown  to  be  an  effective  buffer  for  provider  risk  in  both  a 
statistical  and  a  practical  sense,  it  could  help  facilitate  the  implementation  of  risk  adjustment 
methods  more  generally.  In  providing  a  temporary  ceiling  for  providers  at  risk,  reinsurance  may 
also  make  it  possible  to  assess  the  accuracy  and  administrative  feasibility  of  proposed  risk- 
adjustment  methods  as  they  are  initially  applied  in  actual  practice.  The  results  of  this  study  are 
intended  to  shed  light  on  the  degree,  if  any,  to  which  various  levels  of  stoploss  reinsurance 
contribute  to  the  reduction  of  risk  providers  assume  under  otherwise  risk-adjusted  payment 
systems.  To  the  extent  to  which  it  can  be  used  in  such  a  context,  reinsurance  may  help  ensure 
access  to  health  care  services  for  those,  such  as  the  chronically  ill,  who  might  otherwise  remain 
inadequately  served  in  a  health  care  market  that  has  evolved  to  avoid  unattractive  risks. 
CHAPTER 2

BACKGROUND  ON  RISK  ADJUSTMENT,  REINSURANCE,  AND  METHODS 

This  chapter  is  intended  as  a  more  detailed  review  of  the  background  and  literature  regarding  risk 
adjustment  and  reinsurance.  More  specifically,  it  is  written  to  examine  a  collection  of  analytical 
issues  that  need  to  be  considered  in  order  to  address  the  underlying  questions  posed  for  this  study  in 
Chapter  1.  First,  what  general  criteria  are  used  to  assess  the  applicability  of  risk  adjustment 
methods,  particularly  within  the  context  of  efforts  to  reform  the  health  care  system?  Then,  what 
specific  methods  have  been  proposed  given  those  criteria?  Next,  what  role  does  reinsurance  play  in 
the  health  care  market,  and  how  is  it  typically  defined?  Finally,  what  are  the  methodological  and 
statistical  issues  involved  in  applying  and  evaluating  methods  used  to  set  prospective  health  service 
payment  rates? 

RISK  ASSESSMENT  AND  RISK  ADJUSTMENT 

Risk  assessment,  or  classification,  is  the  process  of  determining  the  relative  risks  of  subsets  of 
individuals  within  a  defined  population.  In  the  context  of  health  reform  and  rating,  it  entails 
modeling  and  calculating  the  expected  expenses  of  one  class  of  person  or  persons  relative  to  others. 
A  standardized  measure  of  risk  can  be  thought  of  as  a  scale  of  relative  values,  where  the  population 
average expected expense is defined as one (1.00). The expected expense, or financial risk,
associated with subsets of individuals can then be expressed in terms of deviation from that
average. A subset of the population with a risk factor of 2.00, for example, is expected to generate
twice  the  expenses  of  the  population  as  a  whole.  Those  with  a  factor  of  .50  are  expected,  on 
average,  to  generate  half  the  expense  of  the  same  overall  population  (Bowen  1995,  Hornbrook  and 
Goodman  1991A,  McClure  1984). 
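
The relative-value scale just described amounts to dividing each class's mean expense by the population mean. A minimal sketch, using hypothetical class labels and made-up costs chosen to reproduce the 2.00 and .50 factors in the example above:

```python
def relative_risk_factors(class_costs):
    """Map each risk class to its mean cost divided by the population mean.

    class_costs: dict of class label -> list of per-person expenses.
    """
    all_costs = [c for costs in class_costs.values() for c in costs]
    population_mean = sum(all_costs) / len(all_costs)
    return {
        label: (sum(costs) / len(costs)) / population_mean
        for label, costs in class_costs.items()
    }


# Made-up expenses: two people in class A, four people in class B.
class_costs = {
    "A": [2_000.0, 2_000.0],
    "B": [500.0, 500.0, 500.0, 500.0],
}

factors = relative_risk_factors(class_costs)

# Weighting each factor by class enrollment returns the population average.
n_total = sum(len(c) for c in class_costs.values())
weighted_average = sum(factors[k] * len(class_costs[k]) for k in class_costs) / n_total
```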

Risk  adjustment  involves  applying  that  assessment  in  the  process  of  modifying  payments  to  health 
plans  to  compensate  for  risk  deriving  from  characteristics  of  enrollees  that  are  not  otherwise  under 
the  control  of  the  plan.  In  other  words,  risk-adjusted  payments  are  intended  to  help  ensure  that 
provider  plans  that  enroll  lower  or  higher  than  average-risk  members  receive  reimbursement  that 
reflects the overall risk of the plan's members (Hornbrook and Goodman 1991A, McClure 1984).
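
In its simplest form, the adjustment multiplies a base payment rate by the risk factors of a plan's enrolled members. The sketch below assumes a single hypothetical base rate of 1,000 per member and made-up factors for two plans; actual payment formulas may include further terms that are not modeled here.

```python
def plan_payment(base_rate, member_factors):
    """Total risk-adjusted payment: base rate times the sum of member factors."""
    return base_rate * sum(member_factors)


base_rate = 1_000.0  # hypothetical population-average payment per member

# Made-up risk factors for the members of two plans.
plans = {
    "X": [2.0, 2.0, 0.5],   # higher-than-average risk mix
    "Y": [0.5, 0.5, 0.5],   # lower-than-average risk mix
}

payments = {name: plan_payment(base_rate, f) for name, f in plans.items()}
per_member = {name: payments[name] / len(f) for name, f in plans.items()}
```

Plan X receives more per member than Plan Y because of who it enrolls, not because of how it delivers care.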

Criteria  for  Assessing  Risk  Adjustment  Methods 

The  process  of  applying  risk  adjustment  methods  in  setting  payment  rates  requires,  first,  a 
methodology  with  which  to  make  the  adjustment,  but  also  some  accounting  of  the  context  within 
which the rate-setting takes place. Table 2.1 shows an array of criteria for assessing risk
adjustment  methods  in  the  context  of  health  system  reform  that  have  been  suggested  by  analysts 
over  the  past  ten  or  more  years.  The  articles  were  selected  to  reflect  a  variety  of  policy  and  research 
perspectives.  Analysts  are  listed,  left  to  right,  in  columns  from  most  to  least  recent.  Some  criteria 
were subsumed in the elaboration of others. Other criteria, such as confidentiality, were of more
general  concern  to  analysts  but  were  not  specifically  listed  to  assess  risk  adjustment  methods. 

The first criterion, validity, reflects the extent to which a method measures what it is supposed to
measure.  It  is  an  overall  assessment  of  a  method's  value  or  appropriateness  to  meet  specific  goals. 
Daley  (1994)  describes  it  as  a  multidisciplinary  concept  encompassing  both  technical  (data  and 
measurement)  and  contextual  (justifiability)  considerations.  Predictive  accuracy,  for  example,  is 
the  primary — though  sometimes  unstated — goal  of  any  risk  adjustment  method  (Thomas  et  al. 
1983, R. Anderson 1991, Hornbrook and Goodman 1991). Each of the other criteria in Table 2.1
reflects some aspect of the circumstances under which any particular method achieves such a goal.
A  method  might  be  valid  given  the  criterion  of  predictive  accuracy,  but  not  administratively 
feasible.  Thus,  there  are  many  different  types  of  validity,  as  it  can  be  assessed  relative  to  any  of  the 
criteria  listed.  Any  method  that  meets  all  of  the  other  criteria  will,  by  definition,  be  valid.  While 
many analysts do not specifically list validity as a criterion, it can be assumed to underlie the
discussion  of  those  criteria  they  do  examine  as  a  whole. 

The  three  most  commonly  cited  criteria  for  assessing  risk  adjustment  methods  are  predictive 
accuracy,  administrative  feasibility,  and  gameability.  Predictive  accuracy  was  cited  in  some  form 
by all of the analysts included in Table 2.1. This criterion is a measure of how well any given
method  predicts  future  health  care  costs  for  a  population.  It  is  typically  discussed  in  the  context  of 
statistical  bias  that  might  be  associated  with  the  assignment  of  relative  expenditure  values  to 
classes  of  individuals. 

Methods that rely on very heterogeneous classes (in terms of health risks) may, potentially, over- or
under-predict values for subgroups within those classes, leading to some consistent bias. For
example, payment rates based on age and gender alone encompass a continuum of health risks
from  very  well  to  chronically  sick  within  each  risk  class.  Because  those  rates  reflect  the  average 
expected  usage  for  each  class  of  individuals  regardless  of  health  risk,  there  are  significant 
opportunities  for  selection  within  risk  groups.  Many  of  the  analysts  emphasized  the  need  for 
reasonably homogeneous risk classes. Expenditures may vary considerably within more
homogeneous risk classes, but that variation should, ideally, be a function of provider efficiency
rather  than  patient  health  risk  (Epstein  and  Cumella  1988,  McClure  1984,  Hornbrook  and 
Goodman  1991). 
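
The selection opportunity described above can be illustrated with a minimal sketch. The cost
figures and the helper name margin_per_member are hypothetical and chosen only to make the
arithmetic visible; they are not drawn from the study data. The sketch assumes a single
age/gender class that is paid its class-average cost per member.

```python
from statistics import mean

# Hypothetical annual costs for the members of one age/gender rating class.
# Values are illustrative only.
class_costs = [200, 300, 400, 500, 600, 1000, 4000, 9000]

# A demographic rate pays the class average for every member,
# regardless of that member's health risk.
class_rate = mean(class_costs)


def margin_per_member(enrolled_costs, rate):
    """Average payment minus average cost for the members a plan enrolls."""
    return rate - mean(enrolled_costs)


# A plan that enrolls the whole class breaks even on average.
whole_class = margin_per_member(class_costs, class_rate)

# A plan that attracts only the four healthiest members keeps a surplus,
# although it is paid the same rate per member.
healthiest = sorted(class_costs)[:4]
selected = margin_per_member(healthiest, class_rate)

print(f"class rate: {class_rate:.0f}")
print(f"margin per member, whole class: {whole_class:.0f}")
print(f"margin per member, healthiest half: {selected:.0f}")
```

The wider the spread of costs within a class, the larger the surplus available to a plan that
can attract the better risks, which is why more homogeneous classes reduce the incentive to select.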

While  expenditure  expectations  are  assigned  to  each  individual  relative  to  the  population  as  a 
whole,  the  predictive  accuracy  of  those  expectations  can  be  assessed  at  both  the  individual  and 
group  levels  of  their  application.  Individual-level  measures  tend  to  suggest  how  well  a  given 
method  explains  the  distribution  of  risk  in  an  underlying  population.  Group-level  measures  suggest 
how well a method accounts for differences given some defined selection criterion. Each of these
types  of  measures  is  discussed  in  more  detail  in  a  later  section. 
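
As a rough illustration of how the two levels of measurement can diverge, the following sketch
computes an individual-level R-squared and a group-level predictive ratio (total predicted cost
divided by total actual cost) on the same data. The choice of these two statistics, the function
names, and the six records are assumptions made for illustration; they are not necessarily the
measures or data used later in this study.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Share of individual-level cost variance explained by the predictions."""
    avg = mean(actual)
    ss_total = sum((a - avg) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total


def predictive_ratio(actual, predicted):
    """Group-level measure: total predicted cost over total actual cost."""
    return sum(predicted) / sum(actual)


# Hypothetical records: (group, actual cost, predicted cost).
members = [
    ("A", 100, 500), ("A", 900, 500), ("A", 500, 500),
    ("B", 200, 1000), ("B", 1800, 1000), ("B", 1000, 1000),
]

actual = [cost for _, cost, _ in members]
predicted = [pred for _, _, pred in members]
individual_r2 = r_squared(actual, predicted)

group_ratios = {}
for group in sorted({g for g, _, _ in members}):
    rows = [(cost, pred) for g, cost, pred in members if g == group]
    group_ratios[group] = predictive_ratio(
        [cost for cost, _ in rows], [pred for _, pred in rows]
    )

print(f"individual-level R-squared: {individual_r2:.2f}")
print(f"group-level predictive ratios: {group_ratios}")
```

In this constructed example the predictions explain only about a fifth of individual-level
variation, yet each group's total is predicted exactly. Individual and group measures can
therefore tell quite different stories about the same method.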

Administrative  feasibility  (or  practicality)  was  also  cited  in  some  form  by  all  those  included  in 
Table  2.1.  This  criterion  reflects  the  fact  that  a  model  may  be  technically  accurate  but 
inappropriate  if  it  poses  too  great  a  burden  on  providers,  patients,  or  the  risk-adjusting  agent. 
Considerations  made  in  this  area  include:  the  availability  and  affordability  of  data  required;  the 
nature  and  extent  of  administrative  changes  required  of  health  plans;  and,  the  feasibility,  ease,  and 
cost  of  implementation  and  audit. 

The  third  most  commonly  cited  criterion  is  gameability.  In  one  sense,  this  is  related  to  the  issue  of 
heterogeneity  discussed  above.  A  risk-adjustment  system  should  be  designed  to  minimize  a  health 
plan's ability to distinguish good from bad risks within risk classes. That will minimize a plan's
capability and incentive to "game" the system by selectively encouraging good-risk individuals (or
discouraging bad-risk individuals) (WHTF 1993).

Gameability  also  refers  to  the  extent  to  which  data  that  underlie  a  given  risk-adjustment  method  can 
be  manipulated  by  the  provider  or  the  patient.  Members  of  health  plans  who  are  asked  to  report 
functional  health  status,  for  example,  may  misrepresent  the  level  of  their  functioning  if  they  are 
aware  that  it  may  affect  payment  to  their  plan.  If  payment  rates  are  a  discernable  function  of 
diagnosis,  providers  may  have  the  incentive  to  assign  codes  that  result  in  higher  reimbursement. 
This  issue  is  closely  related  to  the  ease  with  which  the  data  can  be  collected  and  audited  (Lubitz 
1987). 

The remaining criteria in Table 2.1 highlight specific aspects of the broader considerations
discussed  above.  The  criterion  "reflect  appropriate  care"  is  a  corollary  of  the  considerations  of  bias 
that  might  affect  predictive  accuracy.  Risk  adjustment  should  account  for  the  characteristics  of 
enrolled  individuals  rather  than  the  practice  patterns  of  providers.  That  is,  plans  should  not  be 
rewarded  for  underserving  enrollees  or  penalized  for  providing  necessary  services.  In  this  way,  the 
method  should  promote  efficient  care. 

Stability,  or  reliability,  is  also  related  to  predictive  accuracy  and  involves  the  extent  to  which  the 
risk-adjustment  method  is  consistent  in  generating  expectations  across  populations  and  across  time. 
This  criterion  is  related,  for  example,  to  researchers'  concern  for  "overfitting"  in  the  development 
and  assessment  of  such  methods.  Overfitting  is  involved  when  the  calculation  of  expected  values 
capitalizes  unfairly  or  inappropriately  on  some  characteristic  of  the  estimation  data  used  to  generate 
rates (Hornbrook and Goodman 1995, Shwartz and Ash 1994). A method that fails to produce
comparable results in different but otherwise similar settings may be too finely tuned to some
aspect of its development setting. A method that works better in one type of setting as opposed to
another  may  undermine  other  criteria  such  as  gameability,  political  acceptability,  or  fairness. 
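
Overfitting of this kind can be sketched with a simple split-sample check: rates are estimated
on one half of the data and then evaluated on the other. The records, the two rating schemes
(a coarse age band versus a finer age-band-by-subcell scheme), and the function names below are
hypothetical, constructed so that the finer scheme captures noise in the estimation half.

```python
from collections import defaultdict
from statistics import mean


def fit_cell_rates(records, key):
    """Mean cost per rating cell, where key(record) names the cell."""
    cells = defaultdict(list)
    for rec in records:
        cells[key(rec)].append(rec[2])
    return {cell: mean(costs) for cell, costs in cells.items()}


def r_squared(records, rates, key):
    """Variance in cost explained when each record is assigned its cell rate."""
    actual = [rec[2] for rec in records]
    predicted = [rates[key(rec)] for rec in records]
    avg = mean(actual)
    ss_total = sum((a - avg) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total


# Hypothetical records: (age band, subcell, cost).
estimation = [
    ("young", "y1", 100), ("young", "y1", 200),
    ("young", "y2", 400), ("young", "y2", 500),
    ("old", "o1", 800), ("old", "o1", 1000),
    ("old", "o2", 1400), ("old", "o2", 1600),
]
validation = [
    ("young", "y1", 400), ("young", "y1", 500),
    ("young", "y2", 100), ("young", "y2", 200),
    ("old", "o1", 1400), ("old", "o1", 1600),
    ("old", "o2", 800), ("old", "o2", 1000),
]


def coarse(rec):
    return rec[0]


def fine(rec):
    return (rec[0], rec[1])


results = {}
for name, key in (("coarse", coarse), ("fine", fine)):
    rates = fit_cell_rates(estimation, key)  # rates come from the estimation half only
    results[name] = (
        r_squared(estimation, rates, key),
        r_squared(validation, rates, key),
    )

for name, (est_r2, val_r2) in results.items():
    print(f"{name}: estimation R2 = {est_r2:.3f}, validation R2 = {val_r2:.3f}")
```

Here the finer scheme fits the estimation half better but performs much worse on the validation
half, while the coarse scheme performs about equally on both. A drop of that kind between
estimation and validation results is the usual signal that a method is too finely tuned to its
development data.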

Political  acceptability  is  tangentially  related  to  administrative  feasibility  and  practicality.  It 
addresses  the  overall  question  of  how  various  participants  perceive  the  application  of  risk 
adjustment.  First,  is  the  method  understandable?  Then,  do  health  plans,  the  risk-adjusting  entity, 
patients,  or  individual  providers  assume  a  disproportionate  share  of  the  burden  of  implementing  the 
system?  Is  it  fair,  in  a  political  sense? 

The  criterion  listed  as  reasonableness  and  fairness  really  reflects  both  predictive  accuracy  and 
administrative  feasibility.  The  issue  here  is  that  risk-adjustment  methods  should  produce  fair 
(accurate)  assessments  of  each  participating  health  plan's  risk  burden  within  a  timeframe  that  is 
appropriate  and  acceptable  to  the  plans.  Plans  need  to  be  able  to  predictably  estimate  the  impact  of 
risk-adjustment  transfers  as  an  added  factor  in  their  calculation  of  premium  income  and  expected 
expenditures. 

The  criterion  that  a  risk  adjustment  method  be  comprehensive  suggests  that  it  be  relevant  to  apply 
to  all  potential  participants.  Survey-based  methods,  for  example,  tend  to  exclude  children  and 
those  who  are  not  otherwise  capable  of  responding  to  a  survey.  Such  methods  may  need  to  be 
supplemented  for  those  other  special  cases  (Fowles  et  al.  1994,  Hornbrook  and  Goodman  1995). 

The last criterion listed in Table 2.1 reflects concern that the privacy of patients is protected in the
course of risk adjustment. The availability of the kind of data that underlies many risk adjustment
systems poses some risk of undermining the confidentiality of patient records. The value of
incorporating  patient  service-use  data  in  the  system,  for  example,  may  be  offset  by  an  inability  to 
otherwise  control  access  to  that  information.  Patients,  and  even  providers,  may  be  induced  to  find 
ways  to  mask  such  information  and,  potentially,  undermine  the  risk  adjustment  system. 

Risk  Adjustment  Methods 

Age  and  gender  have  been  used  as  the  basis  for  adjusting  relative  risk  in  insurance  rating  for  some 
time (Wrightson 1990, Reichard 1995). Simple demographic factors are administratively feasible
to apply and are commonly accepted as appropriate risk adjusters. Very young and older persons
tend to use more services than others, and women tend to use more services than men during
childbearing years (Hornbrook et al. 1991A, Browne and Doerpinghaus 1993). The biggest
problem  with  using  these  factors  alone  in  risk-adjusted  rating  is  that  they  do  not  reflect  the  actual 
health  risks  of  a  population.  Consequently,  there  may  be  considerable  risk  variation  within  classes 
that  could  provide  an  incentive  to  select  better  within-class  risks. 

Building, in large part, on the general acceptance of these factors, the adjusted average per capita
cost (AAPCC) used as the basis for risk-based payments to HMOs under the federal Medicare program is
derived from demographic and sociodemographic factors. The AAPCC includes adjustment for age, sex,
institutional status, welfare status, and geographic location (Palsbo 1989). This formulation is
particularly significant because it is the only risk-adjustment method currently used on a nationwide
basis under government programs. As noted in Chapter 1, however, the factors included in the
adjustment routinely account for a very small percentage of observed variation in payments
(Epstein and Cumella 1988, Lubitz et al. 1985, Newhouse 1986, Newhouse et al. 1989, Anderson
et al. 1990A, GAO 1994A). The AAPCC also has been criticized for the administrative difficulty
in  defining  and  establishing  welfare  and  institutional  status  as  well  as  the  fee-for-service  basis  of 
the  costs  used  to  calculate  risk-based  payments  (Palsbo  1989,  GAO  1994B). 

Given  its  prominence,  a  considerable  body  of  work  has  been  devoted  to  improving  the  current 
Medicare  risk-based  payment  formula  (Anderson  et  al.  1986B,  Anderson  et  al.  1989,  Ash  et  al. 
1989,  Epstein  and  Cumella  1988,  GAO  1994A,  Lichtenstein  and  Thomas  1987A,  Lubitz  et  al. 
1985,  Lubitz  1987,  Manton  et  al.  1994,  Newhouse  et  al.  1989,  Newhouse  et  al.  1993A,  Thomas  et 
al.  1983,  Thomas  and  Lichtenstein  1986A  &  B).  NonMedicare  populations  have  been  treated  in 
some  more  recent  work  (Hayes  1991,  Hornbrook  et  al.  1991A  &  B,  Hornbrook  and  Goodman 
1995,  Robinson  et  al.  1991,  Weiner  et  al.  1991).  In  each  of  these  instances,  however,  the  primary 
focus  has  been  to  develop  alternative  risk  adjustment  mechanisms  that  incorporate  some  measure  of 
health status (Thomas et al. 1983, Lubitz 1987, GAO 1994A). In addition to sociodemographic
factors,  these  efforts  can  be  categorized  as  reflecting  measures  of  prior  service  use,  indicators  of 
clinical  conditions  drawn  from  service  use  data,  and  data  gathered  from  patients. 

Prior  use.  Risk  adjustors  based  on  measures  of  prior  use  reflect  payment  levels  for  services  used 
by  health  plan  enrollees  in  a  previous  period  to  derive  prospective  payment  amounts.  Thus,  they 
are  indirect  measures  of  health  status.  Much  like  standard  demographic  information,  measures  of 
prior  use  can  be  drawn  (relatively  inexpensively)  from  existing  administrative  sources  such  as 
claim  data  (Anderson  et  al.  1989,  Beebe  et  al.  1985). 

Research  has  shown  that  information  on  prior  service  use  can  substantially  improve  the  predictive 
accuracy  of  sociodemographic  models  such  as  the  AAPCC  (Anderson  and  Knickman  1984B, 
McCall  and  Wai  1983,  Roos  and  Shapiro  1981).  This  is  particularly  true  for  those  among  the  aged 
with  high  prior  expenses  (Anderson  and  Knickman  1984A,  Freeborn  et  al.  1990,  McCall  and  Wai 
1983).  Non-aged  employed  populations,  as  a  whole,  exhibit  more  modest  related  improvement  in 
the  predictive  accuracy  of  risk-adjustment  models  (Goodman  et  al.  1991,  Hornbrook  et  al. 
1991A), in part because of the lower incidence of chronic conditions in those populations.
Nevertheless,  measures  of  prior  use  do,  generally,  explain  more  of  the  statistical  variation  in 
payments  than  do  demographic  factors  alone,  and  in  that  sense  are  promising  for  further 
development and use (Epstein and Cumella 1988, Lubitz 1987, Newhouse 1986, Thomas and
Lichtenstein  1986B). 

An  important  theoretical  disadvantage  of  such  measures  is  that  they  provide  incentive  to  deliver 
more  services  than  might  otherwise  be  efficient  since  the  payment  for  future  services  would  be 
based  on  the  level  of  current,  or  previous,  service  use  (Hornbrook  and  Goodman  1991,  1995). 
Moreover,  prior  use  capitalizes  on  provider  practice  habits  which  are  not  directly  related  to  patient 
health  status  (McClure  1984).  This  may  help  sustain  inefficient  treatment  patterns  rather  than 
encourage  efficient  care  (Newhouse  1986).  Finally,  measures  of  prior  use  may  be  subject  to 
gaming.  Some  risk  adjustment  methods  specifically  include  some  accounting  of  provider  discretion 
to  mitigate  the  incentive  to  generate  service  use  (Anderson  et  al.  1989,  Ash  et  al.  1989). 

Clinical  indicators.  As  discussed  in  Chapter  1,  only  an  estimated  20  percent  of  variation  in  health 
care  costs  is  predictable  from  year  to  year  (McCall  and  Wai  1983,  Newhouse  et  al.  1989,  Van  Vliet 
1992).  Much  of  that  variation  is  attributable  to  chronic  conditions  and  the  propensity  to  use 
services  (Welch  1985).  Chronic  diseases,  in  particular,  are  markers  of  medical  risk  because  they 
represent  ongoing  need  for  medical  services  (Hornbrook  and  Goodman  1994,  Kronick  et  al.  1995). 
Researchers  have  proposed  to  account  for  that  risk  by  including  clinical  indicators  in  the  risk 
adjustment  model  (Anderson  et  al.  1986A,  Epstein  and  Cumella  1988,  Howland  et  al.  1987, 
McClure  1984).  Such  indicators  can  be  derived  from  clinical  diagnoses  reported  in  claim  files  and 
medical  records  to  reflect  specific  diseases  or  chronic  conditions,  for  example.  The  same  data 
sources  can  be  used  to  create  related  indicators  such  as  the  number  and  type  of  hospitalizations 
during a period or the mortality of plan enrollees (Anderson et al. 1990B, GAO 1994A, Hornbrook
and  Goodman  1991,  Lubitz  et  al.  1987). 

What the General Accounting Office (1994A) described as a more sophisticated method of
including clinical diagnoses in risk adjustment entails accounting for both the presence or absence
of a particular disease and the severity of conditions reflected in the body of diagnoses reported for
any given individual. The Ambulatory Care Group (ACG) case-mix system is one of the most
commonly cited examples of such a system used in risk adjustment (AAA 1993, GAO 1994A,
PPRC  1994,  WHTF  1993). 

ACGs  incorporate  an  approach  for  clustering  ICD-9-CM  diagnosis  codes  to  derive  health  status 
information  from  existing  insurance  claim  data  sources  on  the  premise  that  a  measure  of  a 
population's  "illness  burden"  can  help  explain  variation  in  health  care  consumption.  Individuals  are 
categorized  based  on  age,  gender,  and  diagnoses  assigned  by  their  providers  during  contact  with  the 
delivery  system  over  a  specified  period  of  time,  such  as  a  year.  The  categorization  reflects  the  need 
for  specialty  care,  the  potential  for  hospitalization,  whether  a  condition  can  be  expected  to  persist, 
the  likelihood  of  disability  or  death,  and  the  expected  cost  of  care  (Weiner  et  al.  1991).  ACGs  were 
developed to incorporate diagnoses assigned in ambulatory settings using nonMedicare
populations. Subsequent research has applied the system using diagnoses drawn from all
(ambulatory and inpatient) clinical settings (Fowles et al. 1994, Tucker et al. 1996).

The ACG system also identifies Ambulatory Diagnostic Groups (ADGs) as a preliminary step
in  the  clustering  process  used  to  group  individuals.  ADGs  are  a  series  of  indicators  (flags)  that 
record  the  presence  of  categories  of  conditions  such  as  "Likely  to  Recur:  Discrete"  (ADG  7)  and 
"Allergies"  (ADG  5).  ADGs  can  also  be  grouped  to  reflect  chronic  and  nonchronic  conditions 
(Starfield  et  al.  1991).  While  they  were  not  designed  for  the  purpose  of  risk  adjustment  of  health 
service  payments,  and  have  not  been  tested  to  the  extent  other  commonly  suggested  risk  adjustment 
methods have been, there is some preliminary evidence that they may be useful for adjusting
payment  rates  (Dunn  et  al.  1995,  Fowles  et  al.  1994,  Weiner  et  al.  1994). 

Other  risk  adjustment  methods  that  are  commonly  cited  in  recent  reviews  of  risk  adjustment 
represent  combinations  of  clinical  indicators  and  prior  use  measures  (AAA  1993,  GAO  1994A, 
PPRC  1994,  WHTF  1993).  The  Diagnostic  Cost  Group  (DCG)  system  was  initially  designed 
using  inpatient  diagnoses  and  the  length  of  hospitalization  to  aggregate  individuals  into  subgroups. 
Some  more  recent  variations  incorporate  both  inpatient  and  ambulatory  diagnoses.  DCG 
assignment  is  also  adjusted  for  the  extent  of  physician  discretion  involved  in  a  given  hospital 
admission.  Diagnoses  were  rated  to  reflect  the  probability  that  they  might  be  treated  in  an 
outpatient  setting,  the  likelihood  of  inappropriate  admission  or  length  of  stay,  and  the  extent  to 
which  they  might  be  manipulated.  Accounting  for  discretion  in  hospitalization  is  one  means  to 
guard  against  physicians'  ability  to  game  the  system  by  assigning  more  profitable  diagnosis  codes 
(Ash  et  al.  1989,  Dunn  et  al.  1995,  GAO  1994A). 

The Payment Amount for Capitated Systems (PACS) methodology uses age, gender, disability status,
the  chronicity  of  hospitalizations,  major  diagnostic  categories,  as  well  as  the  level  of  ambulatory 
resource  use  in  the  base  year  (Anderson  et  al.  1989).  Aside  from  incorporating  an  explicit  measure 
of  ambulatory  resource  use,  PACS  provide  more  explicit  differentiation  between  chronic  and  acute 
conditions  than  does  the  DCG  system.  The  PACS  system  also  adjusts  for  multiple  admissions  in 
the  base  year.  There  is  at  least  modest  evidence  that  PACS  out-perform  DCGs  as  a  means  to  set 
capitation rates (Anderson et al. 1990, Van De Ven and Van Vliet 1993).

Both  PACS  and  DCGs  were  developed  and  tested  using  Medicare  data  sources.  They  were  also 
specifically  designed  to  improve  on  existing  Medicare  payment  to  risk-based  health  plans  embodied 
in  the  AAPCC.  While  the  PACS  system  currently  incorporates  ambulatory  resource  use,  both 
PACS  and  DCGs  focus,  in  particular,  on  information  drawn  from  inpatient  stays  and  require 
considerable  claim  data.  These  factors  tend  to  limit  the  applicability  of  those  methods  beyond  the 
Medicare  program  since  nonMedicare  populations  use  a  different  mix  of  services,  such  as  a  higher 
proportion of ambulatory services, and private-sector claim data sufficient to apply them are not yet
widely  available. 

A  team  of  researchers  at  Johns  Hopkins  University  is  currently  working  to  develop  a  capitation 
rating  system  that  would  combine  the  ACG  and  PACS  systems  for  use  under  the  Medicare 
program.  Such  a  system  would  interleave  the  inpatient  utilization  orientation  of  the  PACS  system 
with  the  more  disaggregated  grouping  of  diagnoses  in  the  ACG  system  (Weiner  1995). 

Some  work  on  risk  adjustment  directed  at  nonMedicare  populations  has  focused  on  alternatives  to 
claim  data  as  a  source  of  information  for  risk  adjustment.  James  Robinson  et  al.  (1991)  focused  on 
data  drawn  from  an  employer's  database  that  served  as  a  proxy  for  health  status.  This  approach  is 
constrained  by  the  availability  and  comparability  of  employer  data  files.  It  also  does  not  address 
important specialized populations such as Medicaid recipients.

Data  gathered  from  patients.  Generally,  claim  and  medical  record-derived  data  require  some 
prior  contact  with  the  health  system.  Data  may  be  lacking  for  those  who  are  new  to  a  plan,  or  who 
have  no  previous  contact  with  the  system.  Data  may  be  incomplete  for  those  who  have  existing 
health  problems  that  are  not  reflected  in  health  plan  data  systems.  Moreover,  some  HMOs  do  not 
routinely  record  certain  data  items,  such  as  ICD-9-CM  diagnosis  codes,  that  are  integral  to  the 
types of adjustments described so far (Hornbrook and Goodman 1995).

Direct  measures  of  health  status  can  be  gathered  from  patients  themselves.  Self-reported  data  can 
provide  information  about  those  who  have  more  limited  contact  with  the  health  system.  More 
importantly,  self-reported  health  status  measures  reflect  the  context  within  which  patients 
experience  medical  problems,  rather  than  simply  the  presence  or  absence  of  disease — as  do 
diagnoses.  Self-reported  health  status  measures  are  a  means  to  summarize  the  impact  of  specific 
diseases  and  comorbidities  on  consumer  well-being  and  their  propensity  to  use  care.  Thus,  they 
embody  information  not  otherwise  available  to  health  plans  (Hornbrook  and  Goodman  1995). 
Measures derived directly from patients can be described as including those reflecting individuals'
subjective  perceptions  of  their  health,  instrument-based  assessments  of  patient  functioning,  and  an 
accounting  of  chronic  conditions  (Thomas  and  Lichtenstein  1986B). 

Perceived  health  status.  The  most  commonly  discussed  risk  adjuster  based  on  data  gathered 
directly  from  patients  is  some  general  measure  of  self-reported  health  status  (GAO  1994A).  That 
might  be  drawn  from  a  single  or  multiquestion  survey  through  which  patients  are  asked  to  rate  their 
perception  of  their  health  as  compared  to  other  patients  the  same  age.  Perceived  health  status  was 
included in 15 of 42 studies reviewed by Epstein and Cumella (1988). Most of those studies were
limited  to  a  single-question  survey  of  this  measure  and  that  review  focused  on  studies  of  the  elderly. 
Nevertheless, perceived health status was predictive of utilization, particularly in ambulatory
settings. 

More  recently,  attention  has  been  focused  on  multiquestion  surveys  such  as  versions  of  the  RAND 
36-Item Health Survey (Stewart et al. 1989, Ware and Sherbourne 1992). These instruments
evaluate  a  range  of  multidimensional  health  concepts  through  a  series  of  questions  regarding 
respondents'  physical  and  emotional  well-being.  Responses  to  subsets  of  the  36  questions  are 
combined  to  reflect  scales  such  as  general  health,  physical  functioning,  mental  health,  and  others. 

Two  recent  studies,  in  particular,  have  shown  that  self-reported  health  status  is  a  significant 
measure  of  risk  for  adult  populations.  Hornbrook  and  Goodman  (1995)  compared  the  performance 
of  the  RAND  survey  to  demographic  factors.  Their  study  focused  on  employer-related  enrollment 
in  an  HMO  setting.  They  found  that  models  based  on  a  subset  of  the  RAND  scales  provide 
substantially  better  prediction  performance  than  age  and  gender  alone.  That  is  to  say,  some  scales 
within  the  survey  instrument,  such  as  physical  and  social  functioning,  were  much  better  predictors 
than  others,  such  as  mental  health. 

The  Fowles  et  al.  (1994)  study  compared  the  use  of  36-item  survey-based  measures,  self-reported 
chronic  conditions  (patients  were  asked  to  report  the  existence  of  specific  chronic  conditions), 
demographic  factors,  and  claim-based  risk  adjustors  (ACGs  and  ADGs).  This  study  focused  on  an 
adult  population  that  included  Medicare  and  nonMedicare  enrollment  in  an  HMO.  The  researchers 
found  that  both  survey  and  claim-based  risk  adjustors  significantly  outperformed  demographic 
models. 

Despite  these  strong  results,  both  research  teams  noted  administrative  limitations  of  survey-based 
risk  adjustment  methods.  Survey-based  methods  tend  to  underrepresent  important  subsets  of  plan 
enrollees.  That  would  include  very  young  children  and  some  disabled  and  mentally  handicapped 
individuals  who  are  not  able  to  provide  responses  to  survey  questions,  for  example.  Thus,  survey- 
based  measures  are  not  comprehensive  in  the  sense  that  claim-based  methods  promise  to  be. 

Both  studies  also  reported  survey  response  bias.  Those  who  responded  to  the  survey  had 
significantly  higher  expenditures  per  person  in  the  base  year  of  the  study.  Whereas  Hornbrook  and 
Goodman  found  no  statistically  significant  bias  in  the  next  year  (the  target  year  for  predicting 
expenses),  Fowles  and  her  colleagues  reported  bias  in  both  years  for  non-senior  respondents. 
Those  latter  results  also  noted  that  the  pattern  of  response  bias,  generally,  differs  by  age  and 
gender.  One  important  consequence  of  such  bias  is  that,  if  not  otherwise  moderated,  payment 
expectations  derived  from  survey-based  measures  may  consistently  over  or  under-value  specific 
risk  classes. 

Functional health status. Functional, as opposed to perceived, health status is also derived from
patient  response  data.  These  measures  are  typically  evaluated  in  terms  of  an  individual's  ability  to 
perform  activities  of  daily  living  (ADL)  and/or  "instrumental  activities  of  daily  living"  (IADL). 
ADL  measures  generally  reflect  personal  care  activities  such  as  feeding,  dressing,  and  bathing. 
IADL  measures  focus  on  more  complex  activities  such  as  shopping  and  housekeeping  (Thomas  and 
Lichtenstein 1986B, GAO 1994A). Both these scales are typically reported as a single composite
score  derived  from  a  series  of  provider  or  self-administered  survey  questions.  Because  they  do  not 
rely  on  patients'  perceptions  of  their  health,  functional  health  measures  are  more  objective  and  less 
subject  to  manipulation  than  other  self-reported  measures  (Lichtenstein  and  Thomas  1987B, 
Thomas  and  Lichtenstein  1986B). 

Thomas and Lichtenstein (1986A, 1986B), for example, analyzed two commonly used measures:
the Katz Index of Activities of Daily Living (Katz et al. 1963), an ADL scale; and the
Rosow-Breslau Functional Health Scale (Rosow and Breslau 1966), an IADL measure. They showed that
both  ADL  and  IADL  measures  improved  on  basic  demographic  models,  including  the  AAPCC. 
The  IADL  scale,  in  particular,  significantly  improved  demographic  as  well  as  prior-use  models. 

Global,  or  single-question,  measures  of  functional  impairment  have  also  been  used  in  predicting 
health  care  costs  (Anderson  and  Steinberg  1985,  Lubitz  et  al.  1985).  Such  a  measure  might  reflect 
whether  an  individual  is  disabled  or  limited  in  usual  activities  or  abilities.  In  their  review  of  17 
studies  that  included  some  measure  of  functional  health  status  in  adjusting  payment  rates,  Epstein 
and  Cumella  (1988)  noted  that  single-question  global  measures  seem  to  have  more  predictive 
power  than  both  ADL  and  IADL  measures. 

Lichtenstein  and  Thomas  (1987A  &  B)  also  compared  functional  and  perceived  health  status 
measures.  While  both  types  of  measures  are  predictive  of  expenditures,  they  found  that  perceived 
health  status  is  less  stable  and  that  functional  measures  are  more  useful  for  predicting  health 
expenditures.  It  should  be  noted  that  this  work  preceded  the  general  availability  of  the  RAND 
multiquestion  survey  of  perceived  health. 

Self-reported  chronic  conditions.  The  number  of  chronic  conditions  and  the  existence  of  specific 
chronic  diseases  reported  by  health  plan  enrollees  have  also  been  used  as  risk  adjusters  (Epstein 
and  Cumella  1988,  Fowles  et  al.  1994).  As  was  the  case  with  diagnosis-derived  clinical  indicators, 
proponents  of  these  measures  note  that  people  with  chronic  health  problems  are  more  vulnerable 
and,  thus,  more  likely  to  use  the  health  care  system  than  those  with  more  limited  health  care  needs 
(Thomas  and  Lichtenstein  1986B). 

Epstein  and  Cumella  (1988)  reported  that  clinical  indicators,  generally,  were  predictive  of  health 
costs — both  when  self-reported  and  when  derived  from  clinical  diagnoses.  The  1994  study  by 
Fowles and her colleagues showed that self-reported chronic conditions combined with age and
gender performed as well as, or better than, models based on the RAND questionnaire.

The  primary  advantages  of  data  gathered  from  patients  are:  they  are  predictive  of  future  health 
service  use;  they  are  a  direct  measure  of  health  status  reflecting  the  context  of  patient 
decisionmaking  in  seeking  care;  and,  they  are  a  source  of  information  where  other  sources,  such  as 
diagnoses,  may  be  lacking.  Bias  is,  perhaps,  the  thorniest  limitation  of  such  data  (Fowles  et  al. 
1994).  The  nature  and  extent  of  nonresponse  can  undermine  the  reliability  of  rating  estimates. 
Administrative  feasibility  is  a  problem  because  special  data  collection  efforts  are  needed.  Claim  or 
encounter-based  data,  by  contrast,  are  already  routinely  available  in  many  health  plan  settings. 
Administrative  data  sources  also  tend  to  encompass  the  entire  patient  population,  rather  than 
simply  those  who  are  capable  of  responding  to  a  survey  in  some  form.  Finally,  self-reported  data 
may  be  subject  to  manipulation  and  are  particularly  difficult  to  audit. 

As  noted  throughout  this  report,  there  is  no  simple  consensus  regarding  which  form  of  risk 
adjustment  is  most  appropriate  and  feasible  to  apply  for  payment  purposes  given  the  collection  of 
constraints  posed  by  the  criteria  listed  in  Table  2.1.  While  many  of  the  specific  methods  discussed 
in  this  section  are  being  refined  on  an  ongoing  basis  in  order  to  achieve  such  consensus,  most 
researchers  and  policymakers  agree  with  the  sentiment  expressed  by  a  work  group  convened  by  the 
American  Academy  of  Actuaries  that,  "...no  risk  adjustment  approach  has  been  sufficiently  tested 
in  regard  to  accuracy,  administrative  efficiency,  implementation  issues,  or  expense  to  warrant  its 
recommendation  at  this  time  as  the  best  long-term  approach  (1993,  pg.  2)."  Consequently, 
reinsurance,  rather  than  (or  in  combination  with)  risk  adjustment,  has  been  proposed  as  a  means  to 
moderate  health  plan  risk — at  least  as  an  interim  policy — as  more  sophisticated  risk  adjustment 
techniques  are  developed. 

REINSURANCE 

Reinsurance  is  an  arrangement  between  insurers  to  distribute  financial  risk  in  ways  that  support  and 
enhance  the  activity  of  insurers.  It  involves  a  formal  contractual  arrangement  whereby  a  primary 
carrier  shifts  some  portion  of  an  insurance  risk  to  a  reinsurer.  Because  reinsurance  simply  shifts 
financial  risk  among  insurers,  it  does  not  change  the  nature  of  the  risk  itself.  It  is  a  means  to 
control  the  severity,  rather  than  the  frequency,  of  financial  losses  associated  with  a  given  risk 
(Kramer  1980). 

Functions  of  Reinsurance 

Reinsurance  is  typically  described  as  having  three  basic  functions.  The  first  is  to  improve  the 
underwriting  capacity  of  an  insurer.  Underwriting  capacity  is  the  maximum  amount  of  money  an 
insurer  can  reasonably  risk.  That  amount  is  a  function  of  the  actuarially  determined  expectation  for 
particular  risks,  the  variability  of  those  expectations,  and  the  financial  resources  of  the  insurer  to
cover  unusual  or  catastrophic  loss  behavior.  Reinsurance,  in  effect,  raises  the  nominal  level  of  risk
an  insurer  can  assume  while  providing  a  means  to  limit  (or  control)  the  financial  loss  the  insurer  is 
likely  to  experience  (Baker  1980,  Bovbjerg  1992,  Kramer  1980). 

Reinsurance  is  also  a  means  to  stabilize  swings  in  insurers'  profits  and  losses  due  to  the  uncertainty 
of  any  given  risk.  An  insurer  can  decide  the  maximum  amount  of  loss  it  is  willing  to  bear
on  each  insured  risk,  and  reinsure  the  remainder.  Setting  limits  on  risk  in  this  way  effectively 
reduces  the  variance  an  insurer  can  expect  between  budgeted  and  actual  liability  (Baker  1980, 
Kramer  1980).  The  White  House  Task  Force  on  Health  Risk  Pooling  (1993)  suggested  that  this 
was  the  primary  reason  for  including  stoploss  provisions  under  health  reform.  Reinsurance  makes 
the  outcomes  per  person  more  dense,  and  thus  more  predictable,  by  moderating  the  effects  of 
outcomes  that  are  much  greater  than  average  for  any  given  risk  category.  Reinsurance  regulation  is 
also  used  at  the  state  level  to  allocate  the  burden  of  unusually  high  risks  across  insurers.  Thus,  it  is 
a  means  to  extend  the  private  insurance  market  into  otherwise  unattractive  areas  such  as  individual 
and  small-group  markets  (Baker  1980,  Bovbjerg  1992). 
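
The stabilizing effect of a per-risk loss limit described above can be illustrated with a minimal sketch. The cost figures, the $50,000 threshold, and the function name `cap_at_stoploss` are all hypothetical, and the sketch assumes no coinsurance above the threshold. It simply shows that capping each person's cost narrows the spread of per-person outcomes borne by the primary insurer.

```python
from statistics import mean, pstdev

def cap_at_stoploss(costs, stoploss):
    """Portion of each person's cost retained by the primary insurer under a
    per-risk stoploss (assumption of this sketch: no coinsurance above it)."""
    return [min(c, stoploss) for c in costs]

# Hypothetical annual costs per person; one catastrophic case dominates.
costs = [500, 1_200, 800, 3_000, 250_000, 1_500, 700, 2_300]

retained = cap_at_stoploss(costs, stoploss=50_000)

# Compare the mean and spread of per-person liability before and after the cap.
print("unreinsured:", round(mean(costs)), round(pstdev(costs)))
print("with stoploss:", round(mean(retained)), round(pstdev(retained)))
```

The reinsurer absorbs the difference between the two totals, which is the loss it would price into its premium.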

Reinsurance  can  also  be  used  to  strengthen  the  financial  structure  of  an  insurer's  business.  Pro  rata, 
or  proportional,  forms  of  coverage,  for  example,  involve  the  primary  carrier  ceding  risk  to  a 
reinsurer  such  that  all  premium  income  and  losses  are  shared  between  both  parties.  Under  this  type 
of  coverage,  in  addition  to  increasing  the  underwriting  capacity  of  the  primary  carrier,  the  reinsurer 
acts  as  a  kind  of  pseudo  financial  partner  (Baker  1980,  Tucker  1986).  Along  similar  lines,  insurers 
may  use  this  form  of  reinsurance  as  a  means  to  acquire  technical  assistance  or  expertise,  especially 
when  approaching  new  business  involving  unknown  or  very  specialized  risk  (Bovbjerg  1992, 
Ferguson  1980). 

Types  of  Reinsurance:  Non-proportional 

Reinsurance  can  be  described  as  taking  proportional  and  non-proportional  forms.  Non- 
proportional,  or  excess-of-loss,  reinsurance  is  used  to  limit  loss  rather  than  to  share  risk.  Under 
these  arrangements,  the  primary  insurance  carrier  retains  all  losses  up  to  some  predetermined 
"stoploss"  (or  deductible)  amount,  and  then  recovers  some  or  all  of  the  excess  losses  from  the 
reinsurer.  Risk  can  be  limited  in  this  way  on  a  per-risk  or  aggregate  basis.  However,  aggregate 
forms,  which  cover  an  insurer's  full  book-of-business,  are  not  commonly  available  to  health  plans 
because  they  tend  to  undermine  incentives  for  cost  containment  (Bovbjerg  1992,  Tucker  1986). 
Where  payment  for  pro  rata  agreements  involves  unlimited  sharing  in  premium  income,  the  primary 
insurer  pays  the  reinsurer  a  set  premium  per  risk  for  stoploss  coverage. 

Per-risk  agreements  are  commonly  used  by  health  plans  to  avoid  catastrophic  medical  benefit  costs 
(Bovbjerg  1992,  Morey  1991,  Ward  1993).  Under  such  an  agreement,  a  health  plan  might,  for 
example,  limit  the  loss  they  are  willing  to  assume  to  the  first  $50,000  associated  with  any  given
individual.  They  might  also  pay  a  copayment  for  costs  above  the  stoploss  threshold.  The  loss  limit
defined  may  cover  hospital-based  expenses  only,  or  both  hospital  and  physician-related  costs 
depending  on  the  needs  of  the  plan.  Stoploss  coverage  limited  to  physician  service  costs  is  rare 
because  of  the  relatively  low  dollar  value  of  that  risk.  However,  plans  that  establish  risk  sharing 
arrangements  with  their  providers  may  subsume  coverage  of  those  costs  in  an  overall  stoploss 
policy  and  then  make  that  coverage  available  to  the  providers  in  return  for  a  fee  in  the  form  of  an 
offset  to  their  capitation  amount  (Morey  1991). 

These  (per  risk)  agreements  may  also  include  a  copayment  for  costs  above  the  stoploss  threshold. 
A  copayment  is  intended  to  ensure  that  health  plans  retain  some  financial  interest  to  provide 
required  services  efficiently  regardless  of  having  reached  that  threshold  in  any  given  case. 
Copayments  are  more  likely  to  be  deemed  necessary  as  the  threshold  level  is  lower  (AAA  1993, 
Bovbjerg  1992,  Tucker  1986). 

With  an  excess-of-loss  agreement,  insurers  tend  to  retain  a  higher  level  of  risk  (and  profits),  than  is 
the  case  with  proportional  agreements,  with  the  knowledge  that  there  is  protection  at  the  high  end 
of  losses  (Bovbjerg  1992,  Ferguson  1980,  Tucker  1986).  One  advantage  of  this  type  of  coverage 
is  that  the  insurer  can  get  substantial  risk  protection,  and  attendant  stability  of  risk,  for  a  relatively 
modest  premium  outlay.  Cash  flow  can  be  minimized  between  the  primary  carrier  and  the  reinsurer 
where  the  stoploss  level  is  sufficiently  high  (Ferguson  1980). 

Aside  from  their  use  in  the  private  sector,  government  sponsored  (or  regulated)  stoploss 
reinsurance  programs  have  been  proposed  as  part  of  both  state  and  federal  health  reform  programs 
(Beebe  1992,  Schramm  1992).  The  state  of  Minnesota,  for  example,  expects  to  use  a  reinsurance 
pool  on  a  mandatory  basis  to  pay  for  catastrophic  cases  under  its  recent  health  system  reform 
(MDH  1993).  New  York  State  provides  reinsurance  for  hospital  expenses  at  a  $50,000  stoploss 
level  to  plans  that  do  not  otherwise  show  themselves  capable  of  sustaining  such  risk  under  its 
current  Medicaid  Managed  Care  Program  (NYSDH  1995).  This  approach  to  reinsurance  coverage 
is  considered  relatively  easy  to  implement.  It  also  tends  to  spread  the  cost  of  high-risk  cases 
broadly  across  plans.  Some  researchers  have  noted  that  this  type  of  program  should  be  mandatory 
for  all  participating  carriers  in  order  to  avoid  the  selection  of  only  inefficient  organizations  into  the 
program,  and  to  spread  the  costs  of  the  program  widely.  Systemwide  participation  would  also 
reduce  the  associated  premium  cost,  which  would  then  be  based  on  the  expected  cost  of  all  high- 
cost  claims  in  the  system  (Dunn  et  al.  1995,  PPRC  1994). 

The  level  of  the  stoploss  amount  and  the  extent  to  which  costs  above  that  amount  are  shared
(through  coinsurance)  are  key  considerations  in  defining  a  reinsurance  program.  These
two  factors  represent  a  tradeoff  between  protecting  plans  from  risk  and  providing  an  incentive  to 
manage  care  well.  Among  reinsurers,  stoploss  coverage  at  lower  threshold  amounts  really
constitutes  a  "transfer  of  payments"  arrangement  since  the  bulk  of  the  risk  is  removed  in  favor  of
higher  administrative  costs  to  process  claims  (Morey  1991,  Tucker  1986).  Bovbjerg  (1992)  noted 
that  HMOs  commonly  retain  risk  at  a  level  between  $25,000  and  $75,000  per  covered  life.  Low- 
threshold  amounts  of  $5,000  have  been  suggested  as  a  part  of  small-group  market  reform  (AAA 
1994C).  The  state  of  Maryland  planned  such  a  level  of  coverage  under  its  statewide  reform 
(Anderson  et  al.  1993). 

Coinsurance  levels  are  less  clearly  defined  in  practice,  however.  Bovbjerg  (1992)  suggests  they  are 
not  common  at  higher  stoploss  levels.  Reinsurers  interested  in  providing  coverage  under  a 
proposed  Medicare  demonstration  suggested  10  to  20  percent  coinsurance  would  be  appropriate  for 
hospital-based  services  (Tucker  1986).  New  York  State  requires  a  15  percent  coinsurance  rate  with 
the  stoploss  at  $50,000  under  its  Medicaid  Managed  Care  Program  (NYSDH  1995).  A  survey  of 
HMOs  showed  that  plans  generally  pay  a  coinsurance  rate  on  the  order  of  10  to  20  percent  of  costs 
above  the  stoploss  amount  (Milliman  and  Robertson  1991).  In  any  case,  lower  stoploss  thresholds 
are  likely  to  require  coinsurance  amounts  in  order  to  maintain  plan  incentives  to  manage  care 
effectively  and  efficiently.  The  AAA  used  a  10  percent  coinsurance  rate  with  a  $5,000  stoploss  in 
scenarios  examining  state-level  reforms  (AAA  1993). 
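
As a worked illustration of how a stoploss threshold and a coinsurance rate combine, consider a single case under the New York parameters cited above (a $50,000 stoploss with 15 percent coinsurance). The symbols x (total cost of the case), S (stoploss), and k (coinsurance rate), and the $80,000 case itself, are assumptions introduced here for illustration; the formula is one plain reading of the arrangement rather than a contract definition.

```latex
\begin{aligned}
\text{retained}(x) &= \min(x, S) + k \cdot \max(x - S,\, 0) \\
\text{retained}(80{,}000) &= 50{,}000 + 0.15 \times (80{,}000 - 50{,}000) = 54{,}500 \\
\text{reinsured}(80{,}000) &= 80{,}000 - 54{,}500 = 25{,}500
\end{aligned}
```

Under these assumptions the plan continues to bear 15 cents of each additional dollar above the threshold, which is the residual incentive the coinsurance is meant to preserve.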

Types  of  Reinsurance:  Proportional 

Proportional,  or  pro  rata  reinsurance,  "describes  a  reinsurance  plan,  where  in  return  for  a 
predetermined  proportion  or  share  of  the  insurance  premium,  the  reinsurer  pays  a  predetermined 
proportion  or  share  of  the  loss  plus  allocated  loss  adjustment  expenses"  (Ferguson  1980,  pg.  52). 
The  primary  purpose  of  pro  rata  reinsurance  is  risk  sharing.  "Quota  share"  arrangements  cover  all 
claims  in  the  same  proportion,  while  "surplus  share"  arrangements  allow  for  variation  in  the 
proportion  of  risk  shared  depending  on  the  type  of  risk.  In  both  cases  the  risk  sharing  is  on  an 
ongoing  basis,  and  the  reinsurer  participates  in  covering  any  associated  losses  without  regard  to  the 
actual  frequency  and  severity  of  those  losses.  Insurers  realize  a  profit  by  retaining  the 
administrative  function  to  provide  coverage  while  some  level  of  the  associated  risk  is  passed  on  to, 
or  shared  by,  the  reinsurer  (Ferguson  1980,  Tucker  1986). 
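
A quota share arrangement can be sketched in a few lines. The function name `quota_share`, the 25 percent ceded share, and the premium and loss figures are hypothetical, and the sketch ignores ceding commissions and loss adjustment expenses. It shows only that premium and every loss are split in the same fixed proportion, regardless of how frequent or severe the losses are.

```python
def quota_share(premium, losses, ceded_share):
    """Split premium income and total losses between the primary carrier and
    the reinsurer in one fixed proportion (simplified quota share)."""
    if not 0.0 <= ceded_share <= 1.0:
        raise ValueError("ceded_share must be between 0 and 1")
    total_loss = sum(losses)
    return {
        "reinsurer_premium": premium * ceded_share,
        "reinsurer_loss": total_loss * ceded_share,
        "primary_premium": premium * (1 - ceded_share),
        "primary_loss": total_loss * (1 - ceded_share),
    }

# Hypothetical book of business: the reinsurer takes 25 percent of everything.
result = quota_share(
    premium=1_000_000,
    losses=[40_000, 250_000, 10_000],
    ceded_share=0.25,
)
print(result)
```

A surplus share arrangement would differ in that `ceded_share` could vary by type of risk rather than being a single value.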

What  is  sometimes  referred  to  as  "prospective  reinsurance"  might  be  considered  a  form  of  surplus 
share  coverage.  Prospective  reinsurance  covers  health  service  charges  for  designated  individuals  or 
groups  (PPRC  1994).  Typically,  those  who  are  covered  under  this  arrangement  constitute  what 
would  otherwise  be  considered  high  risks,  such  as  small  groups  or  individuals.  Such  a  system  was 
enacted  in  Connecticut  in  1990.  There,  primary  insurers  pay  a  premium  on  behalf  of  individuals  or 
groups  they  identify,  in  advance,  as  high  risk  to  a  statewide  reinsurance  pool.  The  pool  then  pays 
for  all  losses  for  those  reinsured.  Because  reinsurance  losses  are  projected  to  exceed  premium 
income,  all  primary  carriers  are  assessed  a  pro  rata  share  of  any  excess  losses  (Bovbjerg  1992). 
The  Health  Insurance  Association  of  America  and  the  National  Association  of  Insurance
Commissioners  have  recommended  a  similar  approach  within  the  context  of  small-group  insurance
market  reform  (Schramm  1992,  Bovbjerg  1992). 

New  York  State  has  established  two  risk  sharing  pools.  One  operates  much  like  a  risk  adjustment 
mechanism  in  that  some  accounting  is  made  of  age  and  gender  differences  across  small  group  and 
individual  lines  of  business.  The  second  pool  provides  for  risk  limitation  on  a  prospective  basis. 
Insurers  contribute  to  the  pool  based  on  covered  lives — specifically  identified  for  coverage  in 
advance — and  receive  specified  payments  for  designated  high-cost  medical  conditions  (NYDI 
1992,  PPRC  1994).  Those  payments  reflect  the  average  expected  costs  associated  with  efficiently 
providing  care  for  those  diagnoses.  Where  the  Connecticut  approach  diffuses  the  incentive  to 
provide  care  efficiently  across  participating  carriers,  the  New  York  model  retains  that  incentive  in 
its  defined  payment. 

As  noted  in  the  preceding  section,  pro  rata  reinsurance  is  an  effective  way  to  help  a  primary  insurer 
enter  a  new  class  of  business  or  a  new  area.  It  can  be  used  to  acquire  technical  assistance  by 
tapping  into  a  reinsurer's  expertise  in  specific  areas.  This  form  of  reinsurance  can  also  be  used  to 
limit  underwriting  risk  to  a  level  that  is  consistent  with  a  company's  financial  resources  and 
aversion  to  risk. 

On  the  other  hand,  pro  rata  coverage  has  some  marked  disadvantages,  particularly  in  the  health 
services  market.  It  can  be  the  most  expensive  form  of  reinsurance.  Under  proportional  forms  of 
reinsurance  coverage,  the  reinsurer  is,  by  definition,  vested  in  each  claim.  Unlike  life  insurance, 
which  generates  one  claim  per  covered  risk,  health  insurance  involves  ongoing  contact  between  the 
insurer  and  the  insured.  Proportional  sharing  of  risk  can  involve  a  considerable  amount  of 
administrative  processing  in  the  review  of  service  activity  (Ferguson  1980). 

More  broadly,  reinsurance  is  traditionally  an  enterprise  among  insurers.  Health  plans  such  as 
HMOs  act  as  both  provider  and  insurer.  Private  reinsurers  with  a  proportional  interest  in  a  health 
plan  would  have  to  share  in,  or  at  least  be  comfortable  with,  the  business  of  providing  and 
managing  services  as  well  as  limiting  risk.  Reinsurers  tend  to  want  to  stick  to  the  business  of 
insurance  rather  than  get  involved  in  the  business  of  providing  services.  Thus,  proportional 
reinsurance  has  rarely  been  used  in  the  health  care  market  (Bovbjerg  1992). 

Partial  capitation.  A  hybrid  reimbursement  method  that  can  be  described  as  a  variation  of 
proportional  reinsurance  has  been  proposed  for  payment  purposes  under  government  health 
programs.  Newhouse  (1986,  1994)  has  suggested  "partial  capitation"  whereby  a  significant 
proportion  of  payments  to  plans  is  tied  to  actual  charges.  He  suggests  that  this  approach  would 
combine  the  advantages  inherent  in  capitation  (e.g.,  the  incentive  for  cost  efficiency)  with  a  plan's 
need  to  recover  the  actual  costs  of  services,  thus  reducing  the  incentive  for  risk  selection  based  on
high-cost  cases.  Plans  would  receive  a  percentage  (e.g.,  60  percent)  of  risk-adjusted  projected  costs
for  each  individual  as  a  capitation  amount.  They  would  then  receive  a  retrospective  reimbursement 
for  some  percentage  (e.g.,  40  percent)  of  actual  charges  related  to  their  enrolled  population  on  a 
periodic  basis.  Partial  capitation,  then,  represents  a  trade  off  between  the  market  efficiency  of 
capitation  and  concern  for  equity  involved  in  covering  costs  in  a  regulated  market — in  much  the 
same  way  that  business  autonomy  is  sacrificed  for  risk  limitation  under  traditional  forms  of 
proportional  reinsurance. 
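
The partial capitation payment described above can be sketched as a blend of a prospective and a retrospective component. The function name `partial_capitation_payment` and the enrollee figures are hypothetical, and the sketch assumes the two percentages sum to 100 percent, as in the 60/40 illustration; the original proposal does not necessarily require that.

```python
def partial_capitation_payment(projected_cost, actual_charges, capitated_share=0.6):
    """Blend a prospective capitation amount with retrospective reimbursement.

    Assumption of this sketch: the retrospective share is the complement of
    the capitated share (e.g., 60 percent prospective, 40 percent actual).
    """
    if not 0.0 <= capitated_share <= 1.0:
        raise ValueError("capitated_share must be between 0 and 1")
    prospective = capitated_share * projected_cost
    retrospective = (1.0 - capitated_share) * actual_charges
    return prospective + retrospective

# Hypothetical high-cost enrollee: projected at 2,000, actual charges of 10,000.
projected, actual = 2_000.0, 10_000.0
payment = partial_capitation_payment(projected, actual)
shortfall_partial = actual - payment
shortfall_full = actual - partial_capitation_payment(projected, actual, capitated_share=1.0)
print(payment, shortfall_partial, shortfall_full)
```

In this illustration the plan's shortfall on the high-cost enrollee is smaller under partial capitation than under full capitation, which is the mechanism by which the incentive for risk selection is reduced, while the plan still bears 60 percent of each additional dollar of charges.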

It  is  important  to  note  that  partial  capitation  has  not  been  formally  described  as  a  form  of 
proportional  reinsurance  in  the  past.  The  general  tone  of  its  definition  is,  however,  in  keeping  with 
pro  rata  coverage.  Newhouse  et  al.  (1989)  have  noted  that,  given  the  state-of-the-art  of  risk 
adjustment,  prospectively  set  payment  rates  are  not  likely  to  curb  risk  selection,  and  that  some  form 
of  retrospective  adjustment  in  payment  rates  will  be  required.  In  this  case,  a  governmental  entity, 
or  entities,  charged  to  administer  payment  to  health  plans  would  act  as  a  de  facto  reinsurer.  That 
entity  would  be  vested,  in  a  business  sense,  in  the  operation  of  each  health  plan.  This  could  have 
significant  implications  for  health  plan  autonomy  regarding  plan  operations  and  the  administration 
of  service-related  data  for  purposes  of  reconciling  actual  costs. 

METHODOLOGICAL  &  STATISTICAL  ISSUES 

The  process  of  risk  adjustment  used  to  set  capitation  payment  rates  involves,  first,  classifying 
individuals  in  some  way  consistent  with  the  kind  of  adjustment  criteria  and  methods  described 
above  and  then  calculating  "expected"  values  based  on  that  classification.  Data  on  the  costs  and 
utilization  of  covered  services  from  some  base  experience  period,  for  some  reference  population, 
are  compiled  to  reflect  each  class  of  individuals.  For  example,  those  data  would  be  distributed  by 
age  and  gender  for  simple  demographic  models,  or  by  ACG  category  if  the  ACG  system  is  used. 

Data  on  costs  and  utilization  may  be  derived  through  actuarial  analyses — where  cost  center 
estimates,  such  as  inpatient  charges,  expressed  in  terms  of  usage  per  some  unit  of  members  (e.g., 
per  1000  members)  are  then  adjusted  for  risk  classes  based  on  actuarial  tables.  They  may  also  be 
derived  from  claim  or  encounter  data  that  more  directly  associate  service  use  and  cost  experience 
with  individual  plan  members.  In  either  case,  adjustments  are  made  to  reflect  any  anticipated 
differences  between  the  experience  period  and  the  rating  period  (i.e.,  the  period  for  which  rates  will 
be  set).  Those  adjustments  typically  reflect  projected  changes  in  benefit  structure  or  underwriting
policy,  efficiencies  derived  from  utilization  management,  monetary  inflation,  and  changes  in 
enrollment,  as  well  as  revenue  targets  set  for  the  plan.  Multiple  years  of  experience  might  also  be 
used  to  minimize  fluctuations  in  year-to-year  claim  experience  (Herrle  1993,  Milliman  and 
Robertson  1990,  Sutton  and  Sorbo  1993).

One  adjustment  that  is  especially  germane  to  this  discussion  is  that  made  to  account  for  reinsurance
alternatives  employed  by  the  health  plan.  In  the  case  of  condition-specific  alternatives,  where  the 
costs  associated  with  certain  diagnoses  are  covered  through  a  separate  pool  of  funds,  the  costs 
related  to  the  treatment  of  those  conditions  would  be  removed  so  that  those  data  would  not  be 
included  in  the  subsequent  calculation  of  basic  prospective  rates.  In  the  case  of  stoploss 
alternatives,  the  individual-level  service  cost  data  would  be  truncated  to  reflect  the  level  of  the 
stoploss,  except  for  any  costs  associated  with  coinsurance  above  the  stoploss  level. 
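
The stoploss adjustment described above can be sketched as a truncation applied to each individual's costs before any rate is calculated. The record layout, the risk class labels, the dollar amounts, and the function names `retained_cost` and `class_expectations` are hypothetical; averaging by class is used here only as a simple stand-in for the subsequent rate calculation.

```python
from collections import defaultdict

def retained_cost(cost, stoploss, coinsurance=0.0):
    """Cost left in the rating data for one individual: everything up to the
    stoploss, plus the plan's coinsurance share of any excess."""
    excess = max(cost - stoploss, 0.0)
    return min(cost, stoploss) + coinsurance * excess

def class_expectations(records, stoploss, coinsurance=0.0):
    """Mean retained cost per risk class from (risk_class, annual_cost) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for risk_class, cost in records:
        totals[risk_class] += retained_cost(cost, stoploss, coinsurance)
        counts[risk_class] += 1
    return {k: totals[k] / counts[k] for k in totals}

# Hypothetical experience-period records; one class contains a catastrophic case.
records = [
    ("F 25-44", 1_000), ("F 25-44", 3_000), ("F 25-44", 120_000),
    ("M 45-64", 2_000), ("M 45-64", 6_000),
]

untruncated = class_expectations(records, stoploss=float("inf"))
truncated = class_expectations(records, stoploss=50_000, coinsurance=0.10)
print(untruncated)
print(truncated)
```

Only the class containing the catastrophic case changes, which illustrates how a stoploss removes the influence of outliers from the resulting expectations.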

Once  all  preliminary  adjustments  are  made,  the  resultant  cost  estimates  are  then  reduced  to  a  per- 
member  expectation  of  future  costs  that  can  be  associated  with  each  class  of  individual.  Those 
values  can,  in  turn,  be  used  as  the  basis  for  premium  values  or  capitation  payments.  For  risk-based 
payments  under  the  federal  Medicare  program,  for  example,  actual  Medicare  claim  data  are  used  to 
establish  national  per-member  cost  estimates  for  four  classes  of  payment — Part  A  (hospital)  and 
Part  B  (physician)  expenditures  for  the  aged  and  the  disabled.  Separate  Part  A  and  Part  B 
calculations  are  made  for  people  with  end-stage  renal  disease  at  the  state  level.  These  rates  are 
projected  forward  to  reflect  anticipated  inflation  rates,  changes  in  utilization  patterns,  and 
programmatic  changes.  A  county  level  geographic  adjustment  is  applied  based  on  a  rolling  five 
year  average  of  each  county's  per  capita  experience  relative  to  the  national  average.  Medicare's 
payment  rates  to  health  plans  are  then  adjusted  for  the  age,  gender,  institutional  and  welfare  status 
of  beneficiaries  that  actually  enroll  in  any  given  plan  (Palsbo  1988,  1989). 

Simulations  of  the  application  of  the  risk  adjustment  methods  discussed  above  are  typically
based  on  claim  or  encounter-level  data  that  make  it  possible  to  associate  both  costs  and  risk  criteria 
with  specific  individuals  (Anderson  et  al.  1990,  Dunn  et  al.  1995,  Fowles  et  al.  1994,  Hayes  1991, 
Hornbrook  et  al.  1991,  Hornbrook  and  Goodman  1995,  Newhouse  et  al.  1993,  Robinson  et  al.
1991,  Rossiter  et  al.  1994),  although  actuarial  methods  can  and  have  been  used  (Hayes  1991).  One 
advantage  of  individual-level  data  used  in  the  process  of  risk  adjustment  is  that  more  sophisticated 
rating  criteria  can  be  incorporated  into  the  risk  adjustment  model  than  might  otherwise  be 
administratively  feasible  using  traditional  actuarial  or  cost  center  approaches.  ADGs,  for  example, 
reflect  nonexclusive  diagnostic  categories.  Incorporating  them  in  a  risk  adjustment  model  may 
result  in  hundreds  of  risk  categories  that  could  not  reasonably  be  considered  using  actuarial 
methods  alone. 

Regardless  of  the  number  of  potential  risk  classifications,  statistical  techniques  such  as  ordinary 
least  squares  regression  applied  to  individual-level  data  make  it  possible  to  estimate  parameters 
associated  with  each  risk  criterion.  Robinson  et  al.  (1991)  and  Rossiter  et  al.  (1994)  used  multi- 
part regression  techniques  suggested  by  Duan  et  al.  (1982)  where  both  the  costs  and  the  risk  criteria 
used  to  establish  expectations  were  drawn  from  the  same  period.  The  resultant  expectations  were 
adjusted  for  inflation  in  procedure  prices  and  volume,  and  then  applied  to  individuals  who  were
covered  during  the  rating  period.  This  is  essentially  analogous  to  the  Medicare  formulation  just
described  as  well  as  traditional  actuarial  and  cost  center  based  methods.  Expectations  are 
established  for  each  risk  class  and  then  applied  to  the  population  of  individuals  who  are  active 
during  the  rating  period. 

Alternatively,  Hornbrook  and  Goodman  (1995)  established  charge  expectations  more  directly  for 
the  purposes  of  simulating  and  assessing  risk  adjustment  methods  by  modeling  expenses  in  the 
rating  period  as  a  function  of  risk  factors  identified  for  a  preceding  base  experience  period.  The 
principal  advantage  of  this  approach  is  that  more  precise  transition  relationships  between  risk 
criteria  and  future  costs  can  be  made.  For  academic  purposes,  this  "transition"  approach  removes 
the  need  to  make  estimates  of  year-to-year  adjustments  that  might  otherwise  introduce  a  measure  of 
uncertainty  (or  bias)  unrelated  to  the  application  of  a  risk  adjustment  method  per  se.  This  approach 
may  overstate  the  extent  to  which  any  given  risk  adjustment  method  may  predict  future 
expenditures,  however,  if  the  same  population  that  is  used  to  estimate  payment  expectations  is  used 
to  assess  those  expectations.  Split-population  techniques  (where  risk  class  expectations  are
estimated  from  a  subsample  of  a  study  population  and  then  applied  for  validation  purposes  to  the
remaining  "half"  of  the  population)  are  typically  used  to  address  such  "overfitting."  In  actual
practice,  as  opposed  to  the  more  limited  purposes  of  simulation,  any  prospectively  set  rates, 
including  those  based  on  transition  relationships  across  earlier  periods,  would  require  some 
adjustment  for  projected  differences  between  the  experience  period  and  the  rating  period. 

Both  same-period  and  transition-based  expectations  can  be  derived  by  regressing  a  dependent 
measure  (e.g.,  costs  related  to  covered  services)  on  independent  measures  reflecting  risk  adjustment 
criteria.  The  relationship  of  interest  in  analyses  based  on  transition  effects  can  be  represented  in 
statistical  notation  as: 

$$\mathrm{Costs}_{t,i} = a + b\,R_{1,t-1,i} + c\,R_{2,t-1,i} + \dots + e_{t,i}$$

where:  a  is  an  intercept  term  common  to  all  individuals;  b,  c,  etc.  are  risk-specific  parameters
generated  by  the  model,  and  R_{j,t-1,i}  are  risk  factors  in  year  t-1  for  the  i-th  individual.  Risk  criteria
established  for  the  base  period  (t-1)  are  associated  with  service  cost  experience  in  the  subsequent
(rating)  period.  Risk-specific  parameters  (b,  c,  etc.)  are  estimated  using  a  subsample  of  individuals
(the  estimation  sample)  from  the  study  population.  Those  parameters  are  then  applied  to  the
remaining  "half"  (the  validation  sample)  of  the  study  population  to  establish  their  payment  rates.
Same-period  analyses  can  be  represented  in  this  notation  by  substituting  t  for  the  t-1  associated
with  each  risk  factor  (R_j).
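
The split-half procedure just described can be sketched as follows. This is a minimal illustration, not the estimation actually used in the studies cited: it assumes a single risk factor so that least squares has a closed form, it uses simulated data with invented coefficients, and it splits the population at random, whereas a study may sample on some other basis. The function names `fit_ols` and `split_half` are hypothetical.

```python
import random

def fit_ols(x, y):
    """Closed-form least squares for Cost = a + b * R with one risk factor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def split_half(records, seed=0):
    """Randomly divide records into estimation and validation halves."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Simulated (risk factor in year t-1, cost in year t) pairs; values are invented.
rng = random.Random(42)
records = []
for _ in range(200):
    r_prev = rng.randint(0, 5)  # e.g., a count of conditions recorded in year t-1
    cost_t = 500 + 800 * r_prev + rng.gauss(0, 300)
    records.append((r_prev, cost_t))

estimation, validation = split_half(records)

# Estimate a and b on the estimation sample only.
a, b = fit_ols([r for r, _ in estimation], [c for _, c in estimation])

# Apply the estimated parameters to the validation sample to set payment rates.
payment_rates = [a + b * r for r, _ in validation]
mean_error = sum(rate - cost for rate, (_, cost) in zip(payment_rates, validation)) / len(validation)
print(round(a), round(b), round(mean_error))
```

Because the parameters are never fitted to the validation half, the gap between its payment rates and its observed costs is not flattered by overfitting.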

It  is  a  tenet  of  the  statistics  that  underlie  rating  expectations  that  the  data  used  to  establish  expected 
values  must  reflect  the  population  and  context  (the  "system")  within  which  they  are  applied  (Box
1966,  Shwartz  and  Ash  1994).  Thus,  the  predictive  value  of  any  given  set  of  risk  adjusted
expectations  is  a  function  of  both  the  extent  to  which  the  underlying  method  of  adjustment  explains 
the  distribution  of  (in  this  case)  costs  and  the  extent  to  which  the  estimation  population  and 
environment  serve  as  a  model  for  the  rating  population  and  its  health  service  environment.  In  other 
words,  the  predictive  value  of  a  given  application  of  risk  adjustment  may  be  a  function  of  the 
method  used  to  set  rates  as  well  as  unaccounted-for  differences  between  the  population  used  to 
estimate  rates  and  the  population  to  which  those  rates  are  applied. 

More  generally,  any  discrepancy  between  expected  and  observed  outcomes  can  be  broadly  defined 
as  "bias"  associated  with  the  process  of  risk  adjustment  as  a  whole  (Shwartz  and  Ash  1994,  Ash 
and  Shwartz  1994).  Overfitting  data  in  the  estimation  of  payment  rates,  for  example,  may 
introduce  a  form  of  bias  that  can  be  directly  ascribed  to  differences  between  estimation  and  rating 
populations,  and  lead  to  exaggerated  expectations  for  a  given  model's  predictive  performance. 
Administrative  or  structural  factors  that  were  noted  above  may  also  be  sources  of  bias.  Those 
include,  but  are  not  limited  to,  response  bias  associated  with  data  gathered  from  patients, 
limitations  of  underlying  data  sources,  and  the  host  of  adjustments  that  might  typically  be  made 
when  traditional  actuarial  methods  are  used  to  set  prospective  payment  rates. 

Two  sources  of  bias  that  are  particularly  relevant  to  this  study  are  related  to  selection  effects  and 
the  underlying  distribution  of  service  costs.  The  first,  bias  due  to  selection  effects,  provides  an 
overarching  rationale  for  the  need  for  risk  adjustment  in  that  both  patients  and  providers  make  what 
are  assumed  to  be  nonrandom  choices  regarding  the  provision  of  services.  Risk  adjustment  is 
intended  to  account  for  factors  that  are  known  to  be  associated  with  such  effects.  The  second  is 
bias  that  may  be  associated  with  the  underlying  distribution  of  data  used  to  calculate  risk  adjusted 
expectations.  Assumptions  regarding  the  distribution  of  service  cost  data  will  be  discussed  more 
fully  later  in  this  chapter.  Regardless  of  its  source,  identifying  and  interpreting  the  effects  of 
various  sources  of  bias  may  seriously  complicate  the  application  and  assessment  of  risk  adjustment 
alternatives  used  to  set  prospective  payment  rates  (Iezzoni  and  Greenberg  1994).  One  important 
challenge  for  this  study  will  be  to  find  a  way  to  isolate  and  examine  specific  sources  of 
discrepancies  that  do  emerge  in  the  application  of  risk  adjusted  expectations. 

Independent  Measures 

The  type  of  modeling  and  simulation  planned  for  this  study  requires  some  consideration  of  how 
both  the  risk  factors  and  the  dependent  measure  chosen  for  analysis  will  be  treated.  Specific  risk 
criteria  (or  factors)  that  underlie  the  risk  adjustment  methods  used  to  set  payment  rates  are  reflected 
in  the  kind  of  formulation  just  described  as  independent  measures.  A  very  simple  demographic 
model,  for  example,  might  include  just  age  and  gender.  Age  could  be  represented  as  a  continuous 
variable,  in  which  case  the  chronological  age  of  each  individual  would  be  entered  in  the  model. 
Gender  would  be  entered  as  a  binary  variable,  or  flag,  where  all  individuals  of  one  gender  are  coded
as  "1".  All  individuals  of  the  other  gender  are  coded  as  "0".  Alternatively,  age  could  be
represented  as  a  categorical  variable  reflecting  defined  age  classes.  In  that  case,  it  would  be  entered 
into  a  regression  model  as  a  series  of  binary  flags,  one  for  each  age  class.  Each  individual  would  be 
coded  as  "1"  for  their  respective  age  category  and  "0"  for  all  other  categories.  Gender  would  be
entered  just  as  it  was  in  the  "continuous"  age  model. 
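
The categorical coding described above can be sketched as follows. The age class boundaries and the function name `encode` are hypothetical, and females are coded as "1" purely as a convention for this illustration.

```python
# Hypothetical age classes (inclusive bounds); a real rate book would define its own.
AGE_CLASSES = [(0, 17), (18, 44), (45, 64), (65, 120)]

def encode(age, is_female):
    """Return one binary flag per age class followed by a gender flag."""
    flags = [1 if lo <= age <= hi else 0 for lo, hi in AGE_CLASSES]
    if sum(flags) != 1:
        raise ValueError(f"age {age} falls outside the defined classes")
    return flags + [1 if is_female else 0]

row = encode(age=30, is_female=True)
print(row)
```

When such flags are entered in a regression that also includes an intercept, one age class is normally omitted to serve as the reference category, since the full set of flags always sums to one and would be perfectly collinear with the intercept.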

It  might  be  noted  in  passing  that  age  and  gender  hold  special  significance  in  the  definition  and 
assessment  of  risk  adjustment  methods  because  of  their  historical  acceptance  and  use.  Together, 
they  provide  a  minimal  baseline  for  the  consideration  of  other  methods.  In  addition,  most  studies 
that  assess  other  risk  adjustment  methods  typically  include  some  characterization  of  age  and  gender 
when  defining  the  application  of  those  methods,  if  they  are  not  explicitly  included  in  the  underlying 
classification  (Dunn  et  al.  1995,  Fowles  et  al.  1994,  Hornbrook  and  Goodman  1995,  Weiner  et  al. 
1994). 

Individual-level  data  reflecting  the  independent  risk  factors,  however  they  are  defined,  and  the 
dependent  measure  associated  with  those  individuals,  are  then  used  to  make  the  underlying 
regression  calculations.  An  intercept  term  and  risk-specific  parameters  (notation  a,  b,  c,  etc., 
above)  are  generated  such  that  the  sum  of  those  parameters  multiplied  by  the  respective  risk  factors 
associated  with  any  given  individual  is  the  risk-adjusted  expectation  (for  the  dependent  measure) 
for  that  individual.  Using  the  continuous  age  model  and  notation  described  above,  and  total  health 
service  costs  as  the  dependent  measure,  the  age/gender-adjusted  cost  expectation  for  any  individual 
would  be:  a  +  (b*age)  +  (c*gender).  The  expectation  for  a  30-year-old  female  would  be  calculated 
by  adding  a,  b  times  30,  and  c  times  1  (where  females  are  defined  as  "1"). 
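
The calculation just described can be sketched as follows, assuming ordinary least squares as the regression method. The records, costs, and function names are made up for illustration; the costs are constructed to lie exactly on a line so that the recovered parameters are easy to check.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_ols(rows, y):
    """Least-squares fit of y on [1, age, gender]; returns [a, b, c]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def expected_cost(params, age, female):
    """Risk-adjusted expectation: a + (b * age) + (c * gender)."""
    a, b, c = params
    return a + b * age + c * female

# Made-up (age, gender flag) records; females are coded 1.
people = [(25, 1), (30, 0), (40, 1), (55, 0), (60, 1), (35, 0)]
# Made-up costs that follow a = 100, b = 20, c = 300 exactly.
costs = [100 + 20 * age + 300 * female for age, female in people]

params = fit_ols(people, costs)
print(round(expected_cost(params, 30, 1), 2))  # expectation for a 30-year-old female
```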

The  categorical  versus  continuous  characterization  of  age  in  risk  adjustment  models  is  related  to  a 
more  general  analytical  issue  involved  in  the  definition  of  those  models.  Shwartz  and  Ash  (1994) 
outline  a  theoretical  basis  for  assuming  that  risk  adjustment  methods  based  on  categorical  variables 
may  be  better  able  to  capture  nonlinear  relationships  among  levels  of  risk  than  those  based  on 
continuous  variables.  Thus,  they  suggest  category-based  risk  adjustment  methods  may  perform 
relatively  better  when  outliers  are  included  in  the  analysis  than  when  they  are  excluded.  Risk 
adjustment  systems  based  on  continuous  variables  are  likely  to  perform  less  well  when  extreme 
outlier  cases  are  included.  With  respect  to  the  focus  of  this  study,  the  effects  of  outliers  are 
increasingly  minimized  at  progressively  lower  thresholds  associated  with  stoploss  reinsurance 
levels. 

This  issue  of  discrete  versus  continuous  definitions  of  risk  adjustment  criteria  is  also  interesting 
from a practical standpoint in that the results of actuarial analyses, which underlie much existing
rate formulation, are typically presented in the form of a rate book reflecting discrete categories of
payment (Milliman and Robertson 1990, Sutton and Sorbo 1993). The actuaries responsible for
Medicare payments reduce their calculations to some 30 discrete rate cells which may be more
intuitively useful to providers and consumers. Alternatively, Hornbrook and Goodman (1995) and
Hornbrook et al. (1991) defined a continuous age variable and interaction terms in their age and
gender-based  demographic  models  because  they  better  address  the  character  of  cost  data  used  in 
setting  rates. 

The  characterization  of  other  risk  adjustment  factors  can  present  similar  issues.  For  example,  both 
ACGs  and  ADGs  are  derived  from  the  Johns  Hopkins  ACG  system,  though  the  former  is  a  specific 
categorization of the latter. ACGs are mutually exclusive categories that might easily be presented
in ratebook form. ADGs, on the other hand, are non-mutually-exclusive diagnostic flags that
constitute,  essentially,  a  continuous  32-bit  binary  number  when  included  in  regression  analyses. 
Theoretically,  the  relative  performance  of  ACGs  and  ADGs  may  differ  with  the  extent  to  which 
outliers  are  included  in  the  analysis.  The  relative  performance  of  ACGs  may  be  reduced  at  lower 
stoploss  levels  because  the  number  and  effects  of  outliers  will  decrease  at  some  rate  consistent  with 
the  stoploss  level.  Despite  these  theoretical  expectations,  other  research  has  shown  that  no 
consistent  pattern  emerges  concerning  the  effect  of  trimming  the  data  on  the  R2  associated  with 
different  types  of  risk  adjustment  models  (Thomas  and  Ashcraft  1991). 

Dependent  Measures 

Parameters  b,  c,  etc.  reflect  the  relationship  between  specific  risk  factors  and  some  dependent 
measure.  They  can  be  viewed  as  estimates  of  the  relative  contribution  each  factor  makes  to  the 
model.  In  a  general  sense,  the  individual-level  expectation  that  is  derived  from  such  a  model  is  the 
average  of  the  dependent  measure  for  all  individuals  included  in  the  underlying  calculation  of  the 
model that exhibit the same set of risk factors. Given the simple continuous age model using a
dependent measure of total health service costs described above, the expectation for a 30-year-old
female approximates the average of the costs associated with 30-year-old females included in the
calculation; the two match exactly only when each combination of risk factors has its own parameter.

Just  as  the  specific  characterization  of  independent  measures  needs  to  be  considered  in  applying  a 
risk  adjustment  method,  some  consideration  is  also  needed  regarding  the  treatment  of  the  dependent 
measure.  More  particularly,  the  specific  aims  of  this  study  focus  on  how  assumptions  regarding 
stoploss  thresholds  may  affect  the  application  of  alternative  methods.  Stoploss  reinsurance  will  be 
simulated  in  this  study  by  truncating  the  costs  included  in  the  dependent  measure  to  reflect  various 
threshold  levels.  While  there  is  very  little  research  in  the  application  of  reinsurance  modeling  to 
risk  adjustment  methods,  the  method  of  truncation  used  here  is  related,  at  least  tangentially,  to  other 
data  treatment  methods.  Robinson  et  al.  (1991)  truncated  charges,  suggesting  that  the  excess 
dollars  were  essentially  unpredictable.  Anderson  et  al.  (1990)  truncated  expenditures  at  the  99th 
percentile  because  the  upper  1  percent  were  deemed  too  difficult  for  any  model  to  predict. 
Newhouse et al. (1989) reduced the most extreme outliers to the level of the mean of those outliers.
This  technique  is  also  referred  to  as  "Winsorizing." 
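
The truncation used to simulate stoploss reinsurance can be sketched as capping each person's cost at the threshold. The cost values, threshold levels, and function names below are hypothetical and serve only to show the mechanics.

```python
def apply_stoploss(costs, threshold):
    """Cap each person's annual cost at the stoploss threshold."""
    return [min(c, threshold) for c in costs]

def reinsured_excess(costs, threshold):
    """Dollars above the threshold, i.e. the part shifted to the reinsurer."""
    return sum(max(c - threshold, 0.0) for c in costs)

annual_costs = [0.0, 120.0, 850.0, 4_200.0, 61_000.0, 240_000.0]  # made-up values
for threshold in (100_000, 50_000, 25_000):                         # hypothetical levels
    capped = apply_stoploss(annual_costs, threshold)
    print(threshold, sum(capped), reinsured_excess(annual_costs, threshold))
```

Lower thresholds leave fewer dollars in the dependent measure and shrink the influence of the largest cases, which is the effect the analysis is designed to examine.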

These  techniques  are  used  because  health  service  charge  data  commonly  include  large  outlier  values 
that  can  significantly  affect  parameter  estimates  when  calculating  expected  values  (rates)  as  well  as 
measures  of  rating  method  performance.  More  generally,  person-level  health  service  cost  data  can 
be  characterized  as  including  a  high  percentage  of  cases  with  no  or  minimal  cost  experience  in  any 
given  period  and  a  distribution  skewed  to  the  right  for  the  remaining  cases  because  of  exponentially 
greater  variance  in  cost  among  high-cost  cases  than  among  lower  cost  cases.  Such  a  distribution 
may  undermine  some  of  the  basic  assumptions  needed  to  make  dependable  statistical  inferences. 
Those  include,  particularly,  a  constant  variance,  linearity,  and  a  normal  distribution  for  the 
dependent  measure  (Armitage  and  Berry  1991,  Box  and  Cox  1964,  Hoaglin  et  al.  1983). 

In  order  to  improve  those  assumptions,  health  service  charge  data  used  as  a  dependent  measure  are 
often transformed using the log of the actual charge amount (Armitage and Berry 1991, Duan et al.
1982,  Shwartz  and  Ash  1994).  Log  transformation  is  used  to  transform  health  service  cost  data 
because  it  is  known  to  minimize  problems  associated  with  increasing  variance  between  cases  as  the 
measure  increases,  to  improve  linearity  where  data  exhibit  a  consistently  increasing  slope,  and 
reduce  skewness  where  data  are  positively  skewed  (Armitage  and  Berry  1991). 

It  should  be  noted,  however,  that  using  the  log  is  one  of  a  range  of  transformations  that  might  be 
considered  for  any  given  set  of  data.  To  facilitate  that  consideration,  Box  and  Cox  (1964) 
developed  a  technique  for  finding  an  appropriate  transformation  given  a  specific  set  of  data.  One 
application  of  this  technique,  described  in  more  detail  in  Chapter  4  of  this  report,  involves 
calculating  the  slope  of  a  regression  of  median  values  of  selected  subgroups  of  a  stream  of  data  on 
a  nonparametric  measure  of  the  spread  of  values  within  those  subgroups  (the  "fourth-spread"). 
The  Box/Cox  methodology  suggests  a  power  value  (p)  that  can  be  read  as  the  most  appropriate 
exponent to use to transform a series of data. That is, any given value of x on the original scale will
be replaced by $x^p$. While the exponent suggested by p can, in theory, take a range of values, a more
limited number of alternatives is typically applied in practice. Values less than 1 generally suggest
a logarithmic transformation, and increasingly so as the value approaches zero. One special case,
among values less than 1, is p = 0.5, which suggests the square root rather than the log of values
(Armitage  and  Berry  1991,  Hoaglin  et  al.  1983). 
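
A minimal sketch of a spread-versus-level calculation follows. It assumes the conventional form in which the log of the spread is regressed on the log of the median across subgroups and the suggested power is one minus the slope; the procedure actually applied in this study may differ in detail. The quartile range stands in for the fourth-spread, and the data and function name are hypothetical.

```python
import math
import statistics

def spread_vs_level_power(groups):
    """Suggest a transformation power p = 1 - slope, where the slope comes from
    regressing log(spread) on log(median) across subgroups."""
    xs, ys = [], []
    for values in groups:
        q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
        xs.append(math.log(statistics.median(values)))
        ys.append(math.log(q3 - q1))  # quartile range stands in for the fourth-spread
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1.0 - slope

# Made-up subgroups whose spread grows in proportion to their level.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
groups = [[v * scale for v in base] for scale in (1, 10, 100)]
p = spread_vs_level_power(groups)
print(abs(round(p, 6)))
```

In this constructed case the spread is proportional to the median, so the suggested power is essentially zero, which points to the log transformation.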

One  significant  feature  of  regression  modeling  based  on  log-transformed  data  is  that  the  resultant 
expectations  are  log  values  that  have  to  be  "re-transformed"  into  dollar  amount  equivalents  if  they 
are  intended  to  be  used  to  set  payment  rates.  Further,  the  process  of  transforming,  regressing,  and 
then  re-transforming  data  will  shift  the  mean  of  the  total  expected  dollars  generated  by  the  model  to 
be  consistent  with  the  geometric  mean  of  the  values  on  the  log  scale.  The  implication  of  this  is  that 
expected values from log-based regression calculations need to be adjusted to the original scale
mean  (dollars)  when  log  values  are  re-transformed  to  the  original  scale.  This  adjustment  is 
typically  achieved  by  "smearing"  some  portion  of  the  unaccounted-for  dollars  on  the  original  scale 
across  the  exponentiated  log  values  from  the  regression  model.  The  most  widely  used  method  for 
calculating  a  smearing  factor  was  developed  by  Duan  (1983),  and  consists  of  taking  the  mean  of 
the  exponentiated  residuals  from  the  log-model  regression.  The  exponentiated  expectations  from 
the  log  model  are  then  multiplied  by  the  smearing  factor  to  establish  a  final  expected  value  on  the 
original  scale. 
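
The retransformation step can be sketched as below. The sketch uses an intercept-only log model purely for simplicity, which is an assumption made here for illustration; in that special case the smeared expectation reproduces the arithmetic mean of the original costs, while the unadjusted exponentiated value is only the geometric mean. The cost values and function names are hypothetical.

```python
import math

def smearing_factor(log_actual, log_expected):
    """Duan's smearing factor: mean of the exponentiated log-scale residuals."""
    residuals = [a - e for a, e in zip(log_actual, log_expected)]
    return sum(math.exp(r) for r in residuals) / len(residuals)

def retransform(log_expected, factor):
    """Convert log-scale expectations to dollars and apply the smearing factor."""
    return [math.exp(e) * factor for e in log_expected]

costs = [50.0, 200.0, 800.0, 12_000.0]            # made-up positive costs
log_costs = [math.log(c) for c in costs]
log_mean = sum(log_costs) / len(log_costs)
log_expected = [log_mean] * len(costs)            # intercept-only log model

naive = math.exp(log_mean)                        # geometric mean of the costs
factor = smearing_factor(log_costs, log_expected)
smeared = retransform(log_expected, factor)[0]
print(round(naive, 2), round(factor, 3), round(smeared, 2))
```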

In  their  classic  treatment  of  charge  data,  Duan  et  al.  (1982)  examined  4  alternative  estimation 
techniques  designed  to  address  these  underlying  assumptions.  One  model  used  untransformed  data, 
another  used  the  log  of  those  data.  A  two-part  model  used  separate  equations  to  estimate  the 
probability  of  positive  expenses  and  the  level  of  nonzero  (log)  expenses.  A  four-part  model 
extended  the  two-part  version  to  include  separate  treatment  of  individuals  with  ambulatory-only 
and  hospital  expenses.  That  study  showed  that  the  simpler  models  lead  to  less  reliable  results, 
largely  because  of  untreated  bias  in  the  data.  Bias,  in  this  instance,  refers  to  the  potential 
inadequacy  of  using  the  resultant  expectations  to  predict  actual  costs  due  to  problems  related  to 
statistical  assumptions  regarding  the  underlying  distribution  of  the  original  cost  data. 

The  authors  of  the  study  noted,  however,  that  a  model  that  admits  a  small  amount  of  bias  to  achieve 
high  precision  may  outperform  a  model  free  of  bias  at  the  expense  of  precision.  Their  more 
complicated  models  may  have  been  subject  to  more  overfitting,  particularly  given  the  relatively 
small  data  sets  that  were  used  to  assess  them.  To  test  for  the  effects  of  overfitting  in  their  initial 
results,  the  researchers  conducted  split-half  analyses,  examining  the  mean  forecasting  bias  and 
mean  squared  forecasting  error  associated  with  each  model.  In  that  analysis,  they  could  not 
determine  whether  the  four-part  model  was  significantly  better  than  the  two-part  version,  although 
they  preferred  either  of  those  to  either  of  the  one-part  models. 

Although  the  Duan  study  was  designed  to  estimate  the  demand  for  health  care  services  across 
health  plans  generally,  rather  than  to  set  capitation  rates  in  particular,  these  results  have  been 
extended  to  the  development  of  risk  adjustment  methods  in  the  work  of  researchers  such  as 
Robinson  et  al.  (1991)  and  Rossiter  et  al.  (1994).  Wouters  (1991)  examined  the  application  of 
multipart  models  with  a  special  focus  on  the  technique  used  to  estimate  the  probabilities  for 
positive  expenditures,  substituting  a  tobit  analysis  for  logistic  or  probit  techniques  employed  by 
Duan  et  al.  (1982),  Robinson  et  al.  (1991),  and  Rossiter  et  al.  (1994).  Other  researchers  such  as 
Hornbrook and Goodman (1995) have focused, instead, on models based on untransformed data
under  the  assumption  that  more  complicated  data  treatments  do  not  make  enough  difference  to 
justify  their  use.  The  introduction  of  reinsurance  alternatives  to  such  modeling  may  also  mitigate 
the  potential  advantage  of  more  complicated  modeling. 


Evaluating  Risk  Adjustment  Methods 

As  this  discussion  suggests,  the  modeling  and  simulation  of  risk  adjustment  methods  used  to  set 
prospective  payment  rates  is  typically  accomplished  using  individual-level  data.  Measures  used  to 
assess  the  relative  performance  of  particular  models  at  the  individual  level  include  those  drawn 
from  the  underlying  calculation  of  risk-adjusted  expectations  and  the  comparison  of  those 
expectations  to  actual  values  associated  with  each  individual  during  the  rating  period.  Among  the 
former  of  those  classes,  the  R2  is  the  standard  summary  measure  of  model  performance  when  the 
dependent  variable  is  continuous  (Shwartz  and  Ash  1994).  It  reflects  the  fraction  of  total 
variability  in  costs,  at  the  individual  level,  explained  by  the  various  risk  adjustment  methods.  This 
measure  is  typically  adjusted  for  the  number  of  factors  included  as  independent  measures,  when  the 
underlying  population  is  small.  A  higher  R2  generally  reflects  better  performance.  Mean 
forecasting  bias  and  other  measures  of  forecasting  error  are  among  standard  measures  that  reflect  a 
comparison of expected and actual values.

While  an  R2  is  typically  thought  of  in  the  context  of  the  generation  of  expected  values,  it  can  be 
calculated  from  any  paired  sample  of  actual  and  expected  values.  In  the  case  of  split-half  analyses, 
R2  values  can  be  calculated  for  both  the  estimation  and  validation  (or  rating)  samples,  even  though 
the  validation  sample  is  not  used  to  derive  expectations.  As  discussed  above,  risk-specific 
parameters  are  generated  from  estimation  populations  drawn  at  random  from  a  study  population. 
Those parameters are applied to a validation "half" of the population to minimize the potential for
overfitting  in  assessing  model  performance.  Because  the  parameters  used  to  set  rates  for  the 
validation  sample  are  drawn  from  a  different  population  (the  estimation  sample),  the  R2  associated 
with  that  application  is  expected  to  be  lower,  generally,  than  those  calculated  for  the  initial 
estimation  sample.  An  R2  can  also  be  calculated  for  expectations  derived  from  log-based  models 
once  the  log  values  are  re-transformed  to  the  original  dollar  scale.  In  this  case,  the  R2  will  differ 
from  the  same  measure  calculated  from  the  log  values  because  of  the  adjustments  needed  to  convert 
those  values  to  the  original  scale.  It  is  possible  to  generate  a  negative  R2  in  some  instances, 
although  this  would  be  evidence  of  poor  predictive  power  (Shwartz  and  Ash  1994). 
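
Because the measure depends only on paired actual and expected values, it can be computed for a validation sample in the same way as for an estimation sample. The sketch below assumes the usual definition of one minus the ratio of the squared prediction error to the total squared deviation from the mean; the values are made up, and the second set of rates is deliberately poor to show that a negative result is possible.

```python
def r_squared(actual, expected):
    """1 - SSE/SST for any paired sample of actual and expected values."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - e) ** 2 for a, e in zip(actual, expected))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst

validation_actual = [0.0, 100.0, 400.0, 1_500.0]   # made-up costs
good_rates = [50.0, 150.0, 350.0, 1_450.0]         # close to actual
poor_rates = [1_500.0, 400.0, 100.0, 0.0]          # ranked in the wrong order

print(round(r_squared(validation_actual, good_rates), 4))
print(round(r_squared(validation_actual, poor_rates), 4))
```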

The general formula used to calculate this measure is:

$$R^2 = 1 - \frac{\sum_i (Y_i - \hat{Y}_i)^2}{\sum_i (Y_i - \bar{Y})^2}$$ (Shwartz and Ash 1994)

where $Y_i$ is the actual value for individual i, $\hat{Y}_i$ is the expected value for that individual, and
$\bar{Y}$ is the mean of the actual values.

The R2 has also been proposed as the most appropriate measure of the extent to which any given
health plan may be able to "game" a risk-based payment system (Newhouse et al. 1989, 1993). As
noted in Chapter 1, no risk adjustment system can reasonably be expected to account for all the
variation in health care costs. That is, no such R2 is likely to achieve a maximum value of 1.0.
Some estimate can be made, however, of the maximum amount of variation (the maximum R2) that
might reasonably be predictable for any particular sample of cases. The maximum R2 reflects an
accounting  of  between-class  variation  versus  within-class  variation,  where  within-class  variation  in 
a  fully  defined  model  (one  in  which  all  available  risk  factors  have  been  included)  is  assumed  to  be 
randomly  distributed  and,  thus,  not  otherwise  explainable  in  the  risk  formulation  (Newhouse  et  al. 
1989,  Shwartz  and  Ash  1994).  This  maximum  value  also  provides  a  point  of  reference,  unrelated 
to  any  given  risk  adjustment  method,  for  the  relative  performance  of  various  risk  adjustment 
methods  across  differing  study  populations  and,  in  the  case  of  this  study,  across  differing  stoploss 
reinsurance  levels. 

As  described  in  more  detail  by  Newhouse  et  al.  (1989),  a  maximum  R2  can  be  calculated  using  a 
method  that  is  analogous  to  defining  a  dummy  variable  for  each  person  in  the  study  population. 
Longitudinal  data  covering  a  minimum  of  two  distinct  periods  (e.g.,  years)  are  used  to  estimate 
within-person  variation  across  time,  which  is  then  subtracted  from  total  variance  in  the  population. 
A  correction  is  also  made  for  bias  that  results  from  estimating  within-person  variance  from  a  finite 
time  series  (such  as  data  covering  only  two  years).  Standard  R2  values  can  then  be  assessed  as  a 
percentage  of  the  maximum  R2  established  for  each  population  as  one  indication,  for  example,  of 
the  extent  to  which  health  plans  retain  an  incentive  to  selectively  encourage  (or  discourage) 
enrollment  at  the  individual  level. 

As described in Newhouse et al. (1989), the maximum R2 can be estimated as:

$$R^2_{\max} = \frac{S_a^2}{S_a^2 + S_e^2}$$

where:

$$S_a^2 = \frac{T_A - T_M - (n - 1)\,S_e^2}{T\,(n - 1)} \quad \text{and} \quad S_e^2 = \frac{T_O - T_A}{n}.$$

T is the number of periods. The number of persons is n. $T_A$ is the sum of the squared total
expenditures across periods (T) divided by T. $T_M$ is the sum of total expenditures times itself and
divided by the number of data points (T * n). $T_O$ is the sum of squared expenditures in each period.
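
The variance-component calculation can be sketched as follows, assuming that the maximum R2 is the between-person variance expressed as a share of the sum of between-person and within-person variance. The sketch writes the within-person divisor as n(T - 1), which reduces to n when two periods are available. The panel values and function name are hypothetical.

```python
def max_r_squared(panel):
    """panel[i][t] holds person i's expenditures in period t (same T for everyone)."""
    n, T = len(panel), len(panel[0])
    totals = [sum(person) for person in panel]
    T_A = sum(t * t for t in totals) / T                   # squared person totals / T
    T_M = sum(totals) ** 2 / (T * n)                       # squared grand total / (T * n)
    T_O = sum(x * x for person in panel for x in person)   # squared period values
    s_e2 = (T_O - T_A) / (n * (T - 1))                     # equals (T_O - T_A) / n when T = 2
    s_a2 = (T_A - T_M - (n - 1) * s_e2) / (T * (n - 1))
    return s_a2 / (s_a2 + s_e2)

# Three made-up persons observed over two periods.
panel = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
print(round(max_r_squared(panel), 4))
```

For this small panel the within-person variance estimate is 2 and the between-person estimate is 15, so the result is 15/17.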

While  an  R2  is  a  common  general  measure  of  model  performance,  it  is  not  typically  considered  a 
direct  measure  of  the  accuracy  with  which  a  given  model  predicts  future  costs  (Hornbrook  and 
Goodman  1995).  Techniques  for  testing  the  predictive  accuracy  of  risk  adjustment  models  more 
commonly  involve  some  method  for  comparing  expected  values  generated  by  a  model  with  actual 
charges  incurred  by  some  validation  population.  Measures  of  predictive  accuracy  based  on 
individual-level  data  generally  reflect  the  disparity  in  distribution  between  expected  values  derived 
from  population  data  (a  large  sample)  and  the  more  widely  dispersed  actual  values  of  individual 
cases  (samples  of  1).  Nevertheless,  measures  are  drawn  from  individual-level  data  to  suggest  the 
overall  predictive  accuracy  that  might  be  expected  once  those  data  are  aggregated  into  groups. 
Dunn et al. (1995) have suggested several measures that address specific limitations of the more
general  R2  term.  Since  the  R2  is  a  measure  of  squared  error  and,  thus,  sensitive  to  a  few  extreme 
values  that  typically  appear  in  service  cost  data,  they  proposed  the  mean  absolute  difference 
between  expected  and  actual  values  as  a  way  to  minimize  that  effect.  They  also  proposed  using  the 
percentage  of  cases  within  specific  bands  of  error  to  characterize  the  actual,  rather  than  relative, 
size  of  the  error  associated  with  a  given  application  of  risk  adjustment  models. 

Ratios  of  observed  (actual)  to  expected  cost  values  are  also  commonly  used  as  a  standardized 
measure  of  how  well  a  given  model  predicts  future  costs.  A  ratio  of  1.00  indicates  perfect 
prediction.  The  mean  of  such  a  ratio  provides  an  overall  measure  of  the  bias  associated  with 
predicted  values  for  given  populations.  Duan  et  al.  (1982)  referred  to  this  bias  as  a  measure  of  the 
accuracy  with  which  a  method  predicts  costs.  They  described  the  mean  squared  error  of  predicted 
to  actual  values  as  a  measure  of  the  adequacy  of  a  given  method  because  it  reflects  how  well  that 
error  is  distributed.  Given  two  otherwise  comparable  models,  a  model  with  a  lower  mean  squared 
forecasting  error  is  preferred  because  that  error  is  more  evenly  distributed  across  the  underlying 
population. 
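
The measures discussed above can be sketched as simple functions of paired actual and expected values. The predictive ratio is written here as total actual over total expected cost for a population, which is one common form and an assumption of this sketch; the band width, cost values, and function names are hypothetical.

```python
def mean_absolute_difference(actual, expected):
    """Average size of the error, less sensitive to a few extreme cases."""
    return sum(abs(a - e) for a, e in zip(actual, expected)) / len(actual)

def share_within_band(actual, expected, band):
    """Fraction of cases whose absolute error is no more than `band` dollars."""
    hits = sum(1 for a, e in zip(actual, expected) if abs(a - e) <= band)
    return hits / len(actual)

def predictive_ratio(actual, expected):
    """Observed over expected cost for a population; 1.00 means no overall bias."""
    return sum(actual) / sum(expected)

def mean_squared_error(actual, expected):
    """Average squared error, dominated by the largest misses."""
    return sum((a - e) ** 2 for a, e in zip(actual, expected)) / len(actual)

actual = [0.0, 100.0, 400.0, 1_500.0]    # made-up costs
expected = [500.0] * 4                   # a flat rate equal to the mean cost

print(mean_absolute_difference(actual, expected))
print(share_within_band(actual, expected, band=250.0))
print(predictive_ratio(actual, expected))
print(mean_squared_error(actual, expected))
```

The flat rate in this example is unbiased overall, with a predictive ratio of 1.0, yet only one case in four falls within the band, which illustrates why bias and error distribution are reported separately.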

In  addition  to  individual-level  analyses,  researchers  have  drawn  systematic  or  random  samples  of 
individuals  to  assess  predictive  accuracy  at  the  group  level  using  similar  measures  of  bias. 
Hornbrook  et  al.  (1991),  for  example,  tested  the  predictive  accuracy  of  particular  subgroups  based 
on  age  and  gender  categories  drawn  from  their  overall  population.  Dunn  et  al.  (1995)  examined 
measures  of  prediction  error  at  both  the  individual  and  group  levels.  In  addition  to  measures  of 
bias,  the  percentage  of  cases  that  fall  within  specific  bands  around  perfect  prediction  are  used  in 
group-level  analyses,  although  the  width  of  such  bands  is  typically  narrower  than  that  used  at  the 
individual  level.  Dunn  et  al.  (1995),  Fowles  et  al.  (1994),  and  Weiner  et  al.  (1994)  each  used  a  5 
percent  band  around  a  predictive  ratio  of  1.00. 

Several  basic  results  of  statistical  theory  have  important  bearing  on  the  general  consideration  of 
group-level  measures.  One  is  that  the  mean  of  the  distribution  of  sample  means,  for  random 
samples,  is  the  same  as  the  mean  of  the  population  of  individual  measurements.  Another  is  that  the 
variance  of  sample  means  is  related  to  the  variance  of  the  individual  measurements  by  the  sample 
size.  A  third  basic  result  of  statistical  theory  is  that,  if  the  underlying  distribution  of  some  variable 
is  normal,  the  distribution  of  sample  means  will  be  normal.  Further,  the  Central  Limit  Theorem 
states  that  even  if  the  underlying  distribution  is  not  normal,  the  distribution  of  the  sample  mean  will 
become  closer  to  normal  as  the  sample  size  gets  larger  (Armitage  and  Berry  1991).  Overarching 
these  principles  is  the  understanding  that  the  mean  of  a  distribution  is  a  good  estimate  of  future 
expenditures if "the causal and correlative system which operated during the data taking has not
been interfered with and also operates during the period when predictions are being made" (Box
1966, p. 157).

In the context of this study, parameter estimates for specific risk factors are, essentially, mean
values  drawn  from  a  sample  of  cases  that  exhibit  any  given  factor.  Those  estimates  should  be 
better  as  the  sample  size  used  to  calculate  them  is  greater.  Individual-level  expectations  are  the 
sum  of  the  parameter  estimates  for  any  given  individual.  There  should  be  some  relationship 
between  the  integrity  of  the  underlying  parameter  estimates  and  the  individual-level  expectation. 
Group-level  expectations  are,  in  turn,  means  of  samples  of  the  individual-level  expectations.  There 
should,  intuitively,  be  some  kind  of  relationship  between  the  individual-level  expectations  and 
sample  means,  in  the  form  of  group-level  measures,  given  those  expectations. 

As  implied  by  the  statistical  theory  introduced  above,  the  standard  deviation  of  costs  for  random 
samples  of  individuals  of  any  given  size  is  a  function  of  the  variance  of  those  costs  at  the  individual 
level  and  the  number  of  individuals  in  the  group.  Expectations  derived  at  the  parameter  estimate, 
individual,  and  group  levels  should  be  better  estimates  of  associated  costs  as  the  sample  size 
underneath each level of expectation is greater. That relationship is not necessarily straightforward
since  each  parameter  estimate  underlying  the  individual-level  expectations  could,  theoretically,  be 
drawn  from  a  different  underlying  distribution.  In  practice,  selection  issues  may  also  affect  that 
relationship  at  the  group  level  since  health  plan  members  do  not  necessarily  join  plans,  or  establish 
a  relationship  with  a  physician,  at  random. 
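
For strictly random groups, the relationship between individual-level variance and group size can be checked by simulation: the standard deviation of group means should be close to the individual-level standard deviation divided by the square root of the group size. The sketch below uses a made-up, right-skewed cost population; the group sizes, number of groups, and names are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(0)
# Made-up, right-skewed individual costs with a mean of about 1,000.
population = [rng.expovariate(1 / 1_000) for _ in range(20_000)]
sigma = statistics.pstdev(population)

def sd_of_group_means(group_size, n_groups=2_000):
    """Standard deviation of mean cost across randomly drawn groups."""
    means = [statistics.fmean(rng.choices(population, k=group_size))
             for _ in range(n_groups)]
    return statistics.pstdev(means)

for size in (25, 100, 400):
    # Simulated spread of group means next to the theoretical sigma / sqrt(n).
    print(size, round(sd_of_group_means(size)), round(sigma / math.sqrt(size)))
```

Groups formed through actual enrollment or provider choice need not follow this pattern, which is the point taken up in the discussion of selection.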

Group-level  measures  of  predictive  accuracy  emphasize  the  fact  that  capitation  payments  are 
typically  made  in  the  context  of  groups  of  enrollees.  Anderson  et  al.  (1990)  generated  random 
samples  of  5,000  Medicare  beneficiaries  to  simulate  populations  of  beneficiaries  enrolled  in  health 
plans  and  then  examined  mean  predictive  ratios  and  the  mean  product  moment  correlation  for  those 
groups.  A  study  by  Luft  and  Rosenkranz  (1993)  reported  in  Hornbrook  and  Goodman  (1995) 
grouped  individuals  based  on  the  level  of  their  expected  values  to  assess  whether  systematic 
residual variance was omitted across low to high expected-use groups. Robinson et al. (1991)
selected  random  samples  of  various  sizes  from  a  fee-for-service  population  in  a  health  plan  to 
simulate  employer  groups.  Another  study  drew  random  samples  of  various  sizes  from  an  HMO 
population  and  then  "skewed"  some  portion  of  those  groups  to  reflect  high  and  low-use  groups, 
making  it  possible  to  assess  the  effects  of  group  size  and  bias  across  the  skewed  groups  (Fowles  et 
al.  1994).  Dunn  et  al.  (1995)  drew  both  random  and  nonrandom  samples  (based  on  previous 
diagnosis  or  cost  experience)  for  their  group-level  analyses.  Where  random  samples  are  drawn 
from  a  limited  population  to  test  the  predictive  accuracy  of  expectations  derived  from  regression 
models,  a  "bootstrap"  technique — whereby  an  individual  is  selected  from  a  sampling  population 
and  then  returned  to  the  sample  before  the  next  selection  is  made — is  commonly  used  to  generate  a 
sufficient  number  of  representative  groups  (Efron  and  Gong  1983). 
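
A bootstrap of this kind can be sketched as repeated sampling with replacement, followed by a predictive ratio for each simulated group and the share of groups within a band around 1.00. The population, flat expected value, group sizes, and function names below are hypothetical.

```python
import random

def bootstrap_group_ratios(actual, expected, group_size, n_groups, seed=0):
    """Draw groups with replacement and return each group's actual/expected ratio."""
    rng = random.Random(seed)
    indices = range(len(actual))
    ratios = []
    for _ in range(n_groups):
        members = rng.choices(indices, k=group_size)   # sampling with replacement
        ratios.append(sum(actual[i] for i in members)
                      / sum(expected[i] for i in members))
    return ratios

def share_within(ratios, band=0.05):
    """Fraction of groups whose predictive ratio lies within `band` of 1.00."""
    return sum(1 for r in ratios if abs(r - 1.0) <= band) / len(ratios)

data_rng = random.Random(1)
actual = [data_rng.expovariate(1 / 1_000) for _ in range(5_000)]  # made-up costs
expected = [1_000.0] * len(actual)                                # flat expected value

for size in (10, 100, 1_000):
    ratios = bootstrap_group_ratios(actual, expected, size, n_groups=500)
    print(size, share_within(ratios))
```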

One  general  problem  with  randomly  generated  groups  used  in  these  studies  is  that  they  do  not 
reflect  the  actual  enrollment  pattern  of  individuals.  As  discussed  in  more  detail  earlier  in  this 
chapter, risk adjustment is intended to address (or control for) selection bias. That bias may be a
function  of  both  enrollee  and  provider  behavior.  The  process  of  randomizing  individuals  into 
groups  removes  the  influence  of  selection  effects  within  or  across  plans.  Strict  randomization  may 
also  remove  bias  that  is  inherent  in  any  given  risk  adjustment  method  that  might  otherwise 
distinguish  one  method  from  another.  For  example,  ADGs  reflect  health  status  as  embodied  in 
recorded  diagnoses.  Age  is  used  as  a  proxy  for  health  status  because  costs  are  generally  known  to 
be  associated  with  it  (age).  Age  also  reflects  any  bias  other  than  health  status  associated  with  it  in 
the  population  as  a  whole.  The  same  might  easily  be  said  for  including  gender  in  a  risk  adjustment 
formula.  In  other  words,  alternative  risk  factors  address  different  sources  of  bias  that  may  affect 
payment  across  groups  of  individuals.  Strict  randomization  obscures  all  forms  of  bias,  which  may 
make  it  difficult  to  assess  what  bias  is  actually  being  addressed,  and  how  effectively,  when 
comparisons  are  made  of  alternative  risk  adjustment  methods. 

Data  reflecting  naturally  occurring  groups  are  rarely  available  for  this  type  of  study.  In  this  study, 
record  of  primary  care  provider  (PCP)  assignment  will  be  used  to  identify  groups  of  health  plan 
members.  Random  sampling  will  be  used  at  that  group  (PCP)  level  to  establish  estimation  and 
validation  subsamples,  as  well  as  pseudo  groups  of  patients  of  various  sizes  within  the  validation 
subsample  to  support  group-level  analyses.  Sampling  will  be  accomplished  using  PCP  assignment 
as  recorded  in  the  study  data,  rather  than  sampling  individual  plan  members,  in  order  to  add  a 
modest  degree  of  natural  selection  to  the  group-level  analysis. 
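
Sampling at the PCP level can be sketched as follows. The sketch assumes a simple mapping from member to assigned PCP and builds a pseudo group by pooling randomly chosen PCP panels until a target size is reached; that pooling rule, the identifiers, and the function names are hypothetical, and the procedure used in the study may differ.

```python
import random

def split_by_pcp(assignments, seed=0):
    """assignments maps member id -> PCP id. PCPs, not members, are randomized,
    so each PCP's whole panel lands in the same half."""
    rng = random.Random(seed)
    pcps = sorted(set(assignments.values()))
    rng.shuffle(pcps)
    estimation_pcps = set(pcps[: len(pcps) // 2])
    estimation = [m for m, p in assignments.items() if p in estimation_pcps]
    validation = [m for m, p in assignments.items() if p not in estimation_pcps]
    return estimation, validation

def pseudo_group(assignments, members, target_size, rng):
    """Pool randomly chosen PCP panels until the group reaches target_size
    (a hypothetical rule used only for illustration)."""
    panels = {}
    for m in members:
        panels.setdefault(assignments[m], []).append(m)
    order = sorted(panels)
    rng.shuffle(order)
    group = []
    for pcp in order:
        if len(group) >= target_size:
            break
        group.extend(panels[pcp])
    return group

# Made-up plan: 40 members spread evenly over 8 PCPs.
assignments = {f"m{i}": f"pcp{i % 8}" for i in range(40)}
estimation, validation = split_by_pcp(assignments)
group = pseudo_group(assignments, validation, target_size=10, rng=random.Random(1))
print(len(estimation), len(validation), len(group))
```

Because whole panels move together, any selection that exists across PCPs is carried into both the split-half samples and the pseudo groups.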

CHAPTER 3
HYPOTHESES  AND  METHODS 


While  Chapters  1  and  2  described  the  broader  context,  this  chapter  provides  a  framework  of 
hypotheses  and  methods  that  will  be  applied  in  this  study  to  examine  the  relative  effects  of  risk 
adjustment and reinsurance when they are used in setting risk-adjusted health service payments. An
overview  of  the  data  available  for  this  study,  and  the  risk  adjustment  methods  and  reinsurance 
alternatives  defined  given  those  data,  is  presented  first.  Then  a  series  of  hypotheses  designed  to 
address  the  specific  aims  proposed  in  Chapter  1  is  introduced  along  with  the  measures  that  will  be 
used  to  test  each  hypothesis.  Limitations  of  the  study  are  also  noted.  More  specific  methods  used 
to  develop  and  treat  the  underlying  data  are  discussed  in  the  next  chapter.  The  primary  analysis  of 
results  and  summaries  of  the  measures  outlined  below  are  presented  in  Chapter  5. 

OVERVIEW  OF  DATA  AND  METHODS 

The  data  used  in  this  study  were  drawn  from  a  prior  research  project  and  made  available  under  a 
contractual  agreement  regarding  their  confidential  use  between  Johns  Hopkins  University  and  a 
major  health  insurance  carrier.  Several  issues  related  to  their  availability  are  noted  here  because  of 
their  impact  on  important  study  design  considerations,  such  as  which  risk  adjustment  methods  and 
reinsurance  alternatives  are  included. 

This study was conducted using two years of service use and cost data drawn from two moderately
sized Independent Practice Association (IPA)-model HMOs. IPA-model HMOs establish a
provider  network  by  contracting  with  independent  practitioners,  or  groups  of  practitioners,  who 
then  provide  services  to  the  HMO's  enrolled  members.  Health  plan  benefits  covered  by  the  HMO 
are  generally  limited  to  those  that  enrolled  members  receive  through  contact  with  the  provider 
network.  Each  of  the  plans  in  this  study  is  located  in  a  different  region  of  the  country.  Each 
provides  a  comprehensive  set  of  benefits  to  a — largely — employed  population  (including  both 
subscribers  and  dependents).  Neither  plan  has  a  formal  Medicare  enrollment  or  a  significant 
number of members over 65 years of age, although both include a small number of working aged
or  members  covered  under  an  employer's  retirement  benefit  plan.  While  the  study  populations  are 
relatively  modest,  and  the  administrative  structure  of  the  plans  is  only  one  of  the  variety  of  forms 
capitated  delivery  systems  might  take,  results  from  this  study  should  be  more  widely  applicable 
whenever  prospective  payment  is  established  for  a  defined  population  that  exhibits  some  level  of 
selection  across  providers. 

Risk  Adjustment  Methods 

Since  the  stated  purpose  of  this  study  was  to  examine  the  contribution  of  one  factor  (reinsurance) 
in  the  broader  process  of  applying  risk  adjustment,  rather  than  to  develop  a  specific  risk  adjustment 
method,  the  methods  chosen  for  this  study  were  limited  to  examples  of  commonly  studied 
alternatives such as basic demographic criteria and the ACG case-mix methodology. Moreover, the
methods  considered  for  this  study  were  limited  to  those  most  appropriately  applied  to  nonMedicare 
populations.  This  precluded  the  use  of  PACS,  for  example.  This  also  precluded  the  use  of  the 
AAPCC  methodology,  although  the  simple  demographic  model  that  is  included  is  representative  of 
the  same  class  of  age  and  gender-based  methods.  While  DCGs  were  initially  developed  based  on 
inpatient  data  for  a  Medicare  population,  more  recent  versions  incorporating  diagnoses  drawn  from 
ambulatory  settings  have  been  included  in  at  least  one  comparative  study  based  on  a  nonMedicare 
population  (Dunn  et  al.  1995).  However,  the  algorithms  underlying  those  more  recent  versions 
were  proprietary  and  otherwise  not  available  at  the  time  the  analysis  reported  here  was  conducted. 
Timing  and  form  were  also  factors  in  considering  possible  methods.  The  data  for  this  study  consist 
of  service  claim  and  encounter  information  recorded  during  years  previous  to  this  study.  Thus, 
survey-based  adjustment  methods,  and  methods  based  on  customized  applications  of  proprietary 
employer  information  could  not  be  used. 

Versions  of  four  risk  adjustment  methods  that  have  been  specifically  proposed  for  use  with 
nonMedicare  populations  were  defined  for  this  study.  Again,  the  intention — here — is  not  to 
develop  a  specific  risk  adjustment  model,  but  rather  to  examine  how  types  of  models  typically 
suggested  for  prospective  payment  fare  in  the  presence  of  various  stoploss  levels.  The  methods 
included  do,  however,  represent  varying  levels  of  administrative  and  statistical  sophistication,  as 
well  as  a  range  of  the  explanatory  power  typically  associated  with  such  methods. 

The  first  method  is  a  simple  demographic  model  based  on  age  and  gender  alone.  Aside  from  its 
wide  acceptance  as  reflected  in  the  literature  in  this  area,  this  model  is  intended  as  a  baseline 
measure  of  the  contribution  of  other  methods.  Continuous  and  categorical  forms  of  age  were 
defined  in  separate  versions  of  this  method.  As  described  in  more  detail  in  Chapter  4,  where 
chronological  age  (a  continuous  variable)  and  gender  are  included  as  independent  measures  in  this 
study,  age  is  also  entered  as  a  squared  term  because  costs  exhibit  a  nonlinear  "U"  shape  by  age 
resulting  from  lower  costs  for  individuals  during  middle  years  and  higher  costs  for  the  youngest  and 
oldest  of  the  populations.  An  age/gender  interactive  term  is  also  included  because  of  differences  in 
the  distribution  of  costs  across  gender  categories  by  age.  Eight  age  groups,  recommended  by  the 
carrier  that  provided  the  study  data,  were  defined  for  the  categorical  version  of  age.  Specific 
independent  variables  for  these,  and  each  of  the  other  versions  of  methods  included  in  this  study  are 
listed  in  Table  3.1. 
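
The terms of the continuous age/gender model can be sketched as follows. This is only an illustration: the function name and the 0/1 coding of gender are assumptions, since the coding actually used is described in Chapter 4 rather than here.

```python
def age_gender_terms(age, is_female):
    """Independent variables for the continuous age/gender model.

    Returns [Age, Gender, Age * Age, Age * Gender]. Coding gender as
    1 for female and 0 for male is an assumption made for this sketch.
    """
    gender = 1 if is_female else 0
    return [age, gender, age * age, age * gender]


# A hypothetical 30-year-old female member.
print(age_gender_terms(30, True))
```

The squared term lets a linear model follow the "U" shape of costs by age, and the interactive term lets that shape differ by gender.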

Table 3.1: Risk Adjustment Models

Short name   Model                Independent Variables
A_G(1)       Age/Gender (cont.)   Age, Gender, (Age * Age), (Age * Gender)
A_G(2)       Age/Gender (cat.)    Age[1] - Age[8], Gender
CHR(1)       Chronic (cont.)      Chronic Flag, Age, Gender, (Age * Age), (Age * Gender)
CHR(2)       Chronic (cat.)       Chronic Flag, Age[1] - Age[8], Gender
ACG          ACG                  ACG[1] - ACG[52]
ADG(1)       ADG (cont.)          ADG[1] - ADG[34], Age, Gender, (Age * Age), (Age * Gender)
ADG(2)       ADG (cat.)           ADG[1] - ADG[34], Age[1] - Age[8], Gender

(cont. = continuous; cat. = categorical)

A second risk adjustment method was based on the presence (or absence) of a chronic condition.
As discussed in Chapter 2, various claim and survey-based methods that include this information
have been proposed reflecting the fact that a significant portion of the variation in health service
costs from year to year is attributable to chronic conditions. This method has the advantage over
others based on clinical indicators in that it can be relatively simply defined based on either clinical
sources,  such  as  claims  or  medical  records,  or  self-reported  survey  data.  One  recent  study  by 
Kronick  et  al.  (1995)  used  diagnoses  from  claim  records  to  identify  a  series  of  flags  reflecting 
chronic  conditions  in  order  to  assess  their  explanatory  power  to  determine  payment  rates  for 
disabled  Medicaid  populations  in  several  states.  Despite  the  fact  that  the  study  focused  on  such  a 
specialized  population,  the  general  concept  of  identifying  a  series  of  chronic  conditions,  rather  than 
one  flag,  might  have  been  useful  in  the  context  of  this  study.  However,  the  details  of  the 
categorization  used  in  that  earlier  study  were  not  available  at  the  time  this  study  was  conducted. 
Such  a  categorization  is  embodied  in  the  more  inclusive  framework  of  the  ADGs  that  are  used  in 
this  study.  For  the  purposes  of  this  study,  chronic  conditions  were  identified  using  ADGs 
specifically  defined  to  capture  those  conditions.  One  flag  is  used  to  reflect  the  presence  of  any 
chronic  condition.  Age  and  gender  are  included  in  these  models  because  they  are  not  otherwise 
accounted  for  in  the  chronic  condition  flag. 

A  third  method  was  based  on  the  ACG  case-mix  system.  Again,  as  noted  in  Chapter  2,  this  system 
is  routinely  suggested  as  a  possible  basis  for  risk  adjustment,  particularly  for  programs  that  include 
nonMedicare  populations.  ACGs  are  entered  as  a  series  of  mutually-exclusive  conditional  flags, 
one  for  each  ACG.  The  flag  associated  with  the  ACG  assigned  to  any  given  individual  is  coded  as 
"1".  All  other  ACG  flags  for  that  individual  are  coded  as  "0".  Age  and  gender  are  not  included  as 
independent  variables  in  this  model  because  they  are  already  accounted  for,  at  some  level,  in  the 
ACG  grouping  process. 
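
The coding of ACGs as mutually exclusive flags can be illustrated with a short sketch. Numbering the ACGs from 1 to 52 follows the ACG[1] - ACG[52] notation of Table 3.1; the function name is hypothetical, and the mapping of actual ACG codes to these positions is not shown.

```python
N_ACG = 52  # number of ACG categories entered in the ACG model


def acg_flags(acg_index):
    """Return 52 mutually exclusive 0/1 flags for an ACG numbered 1..52."""
    if not 1 <= acg_index <= N_ACG:
        raise ValueError("ACG index out of range")
    # Exactly one flag is coded 1: the one for the assigned ACG.
    return [1 if k == acg_index else 0 for k in range(1, N_ACG + 1)]


flags = acg_flags(7)
print(sum(flags), flags[6])
```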

The  fourth  risk  adjustment  method  included  in  this  study  is  also  derived  from  the  ACG  system,  but 
based on ADGs. ADG models are often included in analyses of the ACG system because of the
added explanatory power they typically exhibit over ACGs. They are entered as a series of
conditional  flags.  Age  and  gender  are  explicitly  included  in  these  models  because,  unlike  the 
determination  of  ACGs,  ADG  assignment  does  not  account  for  them  (Dunn  et  al.  1995,  Fowles  et 
al.  1994,  Weiner  et  al.  1994).  Appendix  A  includes  complete  listings  of  both  ACG  and  ADG 
categories. 

Reinsurance  Alternatives 

The  nature  of  the  data  available  for  this  study  also  affected  consideration  of  reinsurance 
alternatives.  Reinsurance  methods  based  on  specific  high-cost  conditions  were  considered,  but 
excluded from this analysis for several reasons. First, few such programs yet exist in practice.
Those that do tend to be narrowly defined for specific populations. Second, given the relatively modest
size  of  the  populations  included  in  this  study,  only  a  few  cases  of  the  high-cost  conditions  typically 
included  in  such  programs  occur  during  the  two  years  reflected  in  the  study  data.  Since  those 
conditions  do  not  necessarily  account  for  all  high-cost  cases,  some  level  of  truncation  similar  to  that 
associated with a stoploss level could be deemed appropriate. Dunn et al. (1995) were able to
address  this  issue  in  part  by  defining  their  own  set  of  high-cost  conditions,  given  a  much  larger 
study  population.  Determining  such  a  list  of  conditions  based  on  data  from  this  study  would 
essentially  entail  identifying  high-cost  cases  regardless  of  the  diagnosis.  In  addition,  though 
somewhat  tangentially,  identifying  one,  or  a  few,  high-cost  condition(s)  in  such  small  populations 
could,  conceivably,  compromise  the  confidentiality  of  specific  individuals  if  that  information  is  ever 
connected  to  the  original  plans  from  which  the  data  were  drawn. 

This  study  focused,  instead,  on  the  effects  of  stoploss  reinsurance  as  a  method  to  limit  the  risk 
associated  with  health  service  payments.  In  particular,  it  was  intended  to  examine  the  relative 
contribution  of  various  stoploss  thresholds  in  the  application  of  existing  risk  adjustment  methods, 
and  within  the  context  of  other  data  treatment  methods  typically  employed  to  estimate  risk-adjusted 
payment  rates.  Stoploss  reinsurance  is  accounted  for  in  modeling  health  care  expenditures  by 
limiting  the  dollars  associated  with  any  given  individual  to  the  stoploss  level  established  for  the 
population  plus  any  applicable  coinsurance  amount. 

Four  stoploss  levels,  drawn  from  the  range  of  levels  reflected  in  the  literature  in  this  area,  were 
defined  for  this  study.  $50,000  was  chosen  as  the  highest  stoploss  threshold  because  relatively  few 
additional  cases  are  excluded  at  higher,  but  otherwise  appropriate  levels  (e.g.,  $75,000),  in  the 
study  population.  The  $50,000  threshold  is  used  under  some  state  programs  (NYSDH  1995),  and 
it  is  specifically  included  in  other  recent  literature  (Dunn  et  al.  1995).  $25,000  was  chosen  as  a 
second  threshold  level  because  it  has  been  routinely  used  in  other  related  studies  (Dunn  et  al.  1995, 
Fowles  et  al.  1994,  Robinson  et  al.  1991,  Weiner  et  al.  1994).  Moreover,  the  HMOs  that  supplied 
the  data  underlying  this  study  purchased  stoploss  coverage  at  this  level  during  the  study  period. 
$5,000 was chosen as the lowest threshold limit in large part because it is commonly used in
state-level programs, especially those that address the small employer-group market (AAA 1994C). A
$10,000  threshold  was  defined,  somewhat  arbitrarily,  as  a  crude  intermediate  figure  between 
$5,000  and  $25,000.  Robinson  et  al.  (1991)  also  used  this  figure  to  define  outlying  cases  within 
the  context  of  their  study. 

A  coinsurance  rate  of  10  percent  was  used  in  this  study.  This  aspect  represents  a  departure  from 
most  studies  that  examine  the  application  of  risk  adjustment  methods.  While  such  studies  typically 
include  some  level  of  truncation  comparable  to  a  stoploss  threshold,  they  do  so  for  more  narrowly 
defined statistical reasons. With respect to the use of reinsurance, however, coinsurance is routinely
suggested  as  needed  so  that  health  plans  retain  some  incentive  to  provide  services  efficiently  once 
the  stoploss  threshold  is  reached,  particularly  at  lower  threshold  amounts.  Coinsurance  is  included 
in  this  study  simply  to  simulate  its  effects,  but  not  specifically  to  model  provider  behavior.  The  10 
percent  rate  used  here  is  intended  as  a  compromise  figure  representing  the  lower  range  of  those 
typically  suggested  in  the  literature  for  low  threshold  amounts,  but  an  intermediate  amount  given 
the  higher  thresholds  used  in  this  study  (AAA  1993,  Bovbjerg  1992,  Milliman  and  Robertson 
1991).  A  limited  amount  of  the  primary  analysis  described  here  will  also  be  repeated  using  no 
coinsurance. 
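
The treatment of individual costs under stoploss reinsurance with coinsurance can be sketched as follows. The sketch assumes that the plan retains all costs up to the threshold plus the coinsurance share of any costs above it. The function name and the example dollar amount are illustrative and are not drawn from the study data.

```python
STOPLOSS_LEVELS = (5_000, 10_000, 25_000, 50_000)  # thresholds defined for this study
COINSURANCE_RATE = 0.10  # share of costs above the threshold retained by the plan


def retained_cost(total_cost, threshold, coinsurance=COINSURANCE_RATE):
    """Annual cost attributed to one individual after stoploss truncation."""
    excess = max(total_cost - threshold, 0.0)
    return min(total_cost, threshold) + coinsurance * excess


# A hypothetical member with $40,000 in annual costs at each threshold.
for level in STOPLOSS_LEVELS:
    print(level, retained_cost(40_000, level))
```

Setting the coinsurance argument to zero reproduces simple truncation at the threshold, which corresponds to the limited analysis repeated without coinsurance.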

Study  Design 

Rating  expectations  were  calculated  using  the  transition  relationship  between  risk  adjustment 
factors  identified  in  data  covering  the  first  year  of  the  study  and  costs  incurred  in  the  second  year. 
In  keeping  with  other  transition-based  analyses,  each  HMO  population  was  split  into  estimation 
and  validation  "halves."  Parameters  associated  with  specific  risk  factors  were  calculated  using  the 
estimation  populations,  and  then  applied  to  the  validation  populations.  Assessment  of  model 
performance  focused,  primarily,  on  measures  derived  from  the  validation  population. 

As noted in Chapter 2, strict random assignment into groups removes some of the effects of selection
that  occur  within  health  plans  and  across  providers.  Unlike  most  previous  studies,  sampling  for 
this  study  was  based  on  existing  physician  assignment  to  take  advantage  of  more  naturally 
occurring  aggregations  of  health  plan  enrollees.  Each  enrolled  member  of  the  HMOs  that  underlie 
this  study  chooses,  or  is  assigned,  a  specific  primary  care  practitioner  (PCP)  who  serves  as  a  focal 
point-of-entry  to  health  plan  services.  It  was  this  assignment  that  was  used  to  establish  estimation 
and  validation  subpopulations  based  on  random  samples  of  PCPs,  rather  than  the  identification  of 
individual  plan  members.  In  other  words,  individuals  were  placed  into  either  estimation  or 
validation  subpopulations  based  on  their  PCP  assignment.  This  also  made  it  possible  to  use  that 
assignment  to  establish  pseudo  group  practices  of  varying  size  for  group-level  analyses  of  model 
performance.  It  should  be  noted  that  no  assumption  is  made  regarding  whether  the  PCPs  actively 
coordinated  the  members'  care.  PCP  assignment  is  used  here  simply  as  a  means  to  simulate 
aggregations  of  members  as  they  might  naturally  occur  in  a  health  plan. 
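
A minimal sketch of this sampling approach is given below. It assumes that half of the PCPs are drawn at random into the estimation sample; the function name, the seed, and the data layout (a mapping from member identifier to PCP identifier) are hypothetical.

```python
import random


def split_by_pcp(member_pcp, seed=0):
    """Split members into estimation and validation halves by sampling PCPs.

    member_pcp maps a member id to the id of that member's assigned PCP.
    All members of a given PCP land in the same half.
    """
    pcps = sorted(set(member_pcp.values()))
    rng = random.Random(seed)
    rng.shuffle(pcps)
    estimation_pcps = set(pcps[: len(pcps) // 2])
    estimation = [m for m, p in member_pcp.items() if p in estimation_pcps]
    validation = [m for m, p in member_pcp.items() if p not in estimation_pcps]
    return estimation, validation
```

Because the unit of sampling is the PCP rather than the member, the two halves will generally differ somewhat in size and composition, which is the modest selection effect the design is meant to retain.
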

Other studies have also drawn samples limited to specific high-cost or chronic conditions to test the
bias  inherent  in  risk  adjustment  associated  with  such  cases.  This  has  the  advantage  of  isolating 
potentially  important  sources  of  bias.  Again,  because  the  study  populations  were  of  relatively 
modest  size,  they  did  not  provide  a  sample  frame  sufficient  to  support  the  simulation  of  such 
groups.  The  use  of  PCP  assignment  in  sampling,  instead  of  individual-level  identifiers,  does 
introduce  a  modest  level  of  selection  bias  associated  with  that  assignment  that  may  be  evident  at  a 
population  level. 

Finally,  the  study  populations  were  limited  to  plan  members  who  were  enrolled  for  the  full  length  of 
the  study  period.  The  study  was  limited  in  this  way  in  order  to  ensure  that  the  population  used  to 
estimate  rating  expectations  was  similar  in  every  other  reasonable  respect  to  that  of  the  rating 
population. ACG assignment, for example, was based on 12 months' worth of data. Cost
expectations  based  on  that  assignment  would,  ideally,  cover  a  similar  period  of  time.  In  actual 
practice,  intervening  factors  such  as  enrollment  and  disenrollment  patterns,  and  rates  of  death, 
would  have  to  be  included  in  setting  prospective  rates.  Such  factors  are  typically  treated  in  separate 
analyses  of  part-year  enrollees  that  go  beyond  the  more  narrow  purpose  of  comparing  reinsurance 
and  risk  adjustment  criteria  defined  for  this  study. 

HYPOTHESES 

Aim  1:  to  examine  the  contribution  that  various  stoploss  reinsurance  options  make  to 
the  relative  performance  of  risk  adjustment  methods. 

Hypothesis 1a: The relative performance of alternative risk-adjustment methods will be
consistent across measures, and across comparable study sites.

Hypothesis 1b: Measures of model performance associated with each of the risk-adjustment
methods used in this study will improve with the application of successively lower
maximum values associated with individual-level (stoploss) reinsurance.

Hypothesis 1c: A lower reinsurance level, alone, may remove some of the relative difference
between  risk  adjustment  methods  evident  in  measures  of  model  performance. 

The  first  step  in  this  analysis  will  be  to  examine  individual-level  measures  of  model  performance  to 
establish  a  baseline  for  the  comparison  of  the  alternative  models,  and  respective  iterations. 
Maximum R² values will be calculated for each study population to serve as relative benchmarks in
the  analysis.  Separate  calculations  will  be  made  on  untruncated  costs  as  well  as  costs  given  each 
stoploss  level.  Subsequent  analyses  will  be  limited  to  truncated  cost  values  since  those  will  be 
most  representative  of  reinsurance  alternatives. 

Model-specific R² values then will be calculated given each of the risk adjustment models listed in
Table 3.1, at each of the four stoploss levels. Again, individual-level data will be used. Values for
both  the  estimation  and  validation  samples  will  be  calculated  using  the  formula  suggested  by 
Shwartz  and  Ash  (1994).  Measures  of  forecasting  error  will  be  derived  from  the  absolute  error 
associated  with  the  risk  adjusted  expectations  as  they  are  applied  to  the  validation  population. 
Following  the  example  of  Dunn  et  al.  (1995),  those  will  include  the  mean  absolute  error  and  the 
percentages  of  individuals  who  fall  within  targeted  bands  around  perfect  prediction. 
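
The individual-level measures can be sketched as follows. The R² shown is the conventional form, one minus the ratio of the error sum of squares to the total sum of squares; it is assumed here as a stand-in, since the exact formula from Shwartz and Ash (1994) is not reproduced in this chapter. Defining the bands in absolute dollars is also an assumption, and all function names are illustrative.

```python
def r_squared(actual, predicted):
    """1 - SSE/SST, computed on whichever sample the predictions apply to."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and expected costs."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def share_within_band(actual, predicted, band):
    """Proportion of individuals whose expected cost is within +/- band dollars."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= band)
    return hits / len(actual)
```

When parameters estimated on one sample are applied to another, this form of R² can be negative, which is one reason validation values are examined separately from estimation values.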

Aim  2:  to  compare  the  statistical  effects,  and  resultant  predictive  accuracy,  of 
reinsurance  modeling  and  common  mathematical  treatments,  such  as  log 
transformation  and  multipart  methods,  used  to  improve  the  dependability  of  inferences 
made  from  risk  adjustment  models. 

Hypothesis  2a:  Power  calculations  derived  from  Box-Cox  methodology  will  indicate  that 
transformation  of  the  dependent  measure  (total  service  costs)  is  appropriate. 

Hypothesis  2b:  Increasingly  complex  treatment  of  the  dependent  measure  will  increase  the 
R² of each risk-adjustment model at each level of reinsurance.

A  power  calculation  will  be  used  to  estimate  an  appropriate  transformation  for  the  data  as  a  whole 
in  each  study  plan,  and  at  each  reinsurance  level.  The  calculation  used  in  this  study  is  derived  from 
the slope of a spread-versus-level plot as outlined in Hoaglin et al. (1983), using a decile ranking of
total  service  costs  during  the  second  year  of  the  study.  Tests  such  as  normal  plots  and  an 
examination  of  variance  throughout  the  data  will  be  used  to  assess  the  contribution  transformation 
actually  makes  to  underlying  statistical  assumptions.  Log  transformation  will  be  included  in  this 
assessment,  in  any  case,  because  it  is  commonly  applied  in  such  modeling. 
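
A sketch of a spread-versus-level calculation is given below. It assumes the usual form of the method: within each decile of costs, the log of the spread (here the interquartile range) is regressed on the log of the level (the median), and the suggested power is one minus the slope. A power near zero points to log transformation. The choice of the interquartile range, the exclusion of deciles with a zero median or zero spread, and the function name are assumptions of this sketch.

```python
import math
import statistics


def spread_level_power(costs, n_groups=10):
    """Suggested transformation power, 1 - slope of log(spread) on log(level)."""
    ordered = sorted(costs)
    size = len(ordered) // n_groups
    points = []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(ordered)
        chunk = ordered[start:end]
        level = statistics.median(chunk)
        q1, _, q3 = statistics.quantiles(chunk, n=4)
        spread = q3 - q1
        # Logs are undefined for non-positive values, so skip those groups.
        if level > 0 and spread > 0:
            points.append((math.log(level), math.log(spread)))
    if len(points) < 2:
        raise ValueError("too few usable groups to estimate a slope")
    x_bar = sum(x for x, _ in points) / len(points)
    y_bar = sum(y for _, y in points) / len(points)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in points)
             / sum((x - x_bar) ** 2 for x, _ in points))
    return 1.0 - slope
```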

Essentially,  three  data  treatments  in  addition  to  the  truncation  mimicking  stoploss  coverage  will  be 
applied  in  this  stage  of  the  analysis.  One  involves  using  the  log  of  associated  costs.  The  other  two 
are  multipart  variations  of  the  log  model  following  the  methods  described  by  Duan  et  al.  (1982). 
For  a  two-part  treatment,  estimation  parameters  will  be  derived  from  data  for  individuals  who  have 
some  claim  or  service-cost  experience.  The  probability  of  actually  incurring  service  costs  will  be 
calculated  using  logistic  analysis,  and  then  applied  to  the  prior  estimation  parameters.  A  four-part 
treatment  will  involve  separate  estimations  of  expectations  for  individuals  who  generate  costs 
related  only  to  ambulatory  services  and  those  who  have  some  history  of  inpatient  experience  during 
the  rating  period.  Smearing  factors  for  all  of  the  log  models  will  be  based  on  methodology 
developed  by  Duan  (1983).  The  summary  measures  described  for  the  analysis  of  the  first  set  of 
hypotheses  will  be  recalculated  and  examined  given  these  additional  data  treatments. 
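
The logic of the two-part treatment with a smearing factor can be sketched in a simplified form. In this sketch, risk cells stand in for the regression models: the proportion of members with any cost in a cell replaces the fitted logistic probability, and the mean log cost among users replaces the fitted log-cost regression. The smearing factor is taken as the mean of the exponentiated log-scale residuals, pooled across cells. These simplifications and all names are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def fit_two_part(records):
    """records: iterable of (risk_cell, cost) pairs from the estimation sample."""
    by_cell = defaultdict(list)
    for cell, cost in records:
        by_cell[cell].append(cost)

    prob_any, mean_log, residuals = {}, {}, []
    for cell, costs in by_cell.items():
        users = [c for c in costs if c > 0]
        prob_any[cell] = len(users) / len(costs)  # part 1: any cost at all
        if users:
            logs = [math.log(c) for c in users]
            mean_log[cell] = sum(logs) / len(logs)  # part 2: log cost of users
            residuals.extend(v - mean_log[cell] for v in logs)
    # Smearing factor: mean of exponentiated residuals on the log scale.
    smear = sum(math.exp(r) for r in residuals) / len(residuals)
    return prob_any, mean_log, smear


def expected_cost(cell, prob_any, mean_log, smear):
    """Retransformed expectation in dollars for a member of the given cell."""
    if cell not in mean_log:
        return 0.0
    return prob_any[cell] * math.exp(mean_log[cell]) * smear
```

Without the smearing factor, exponentiating the mean log cost would understate the expected dollar amount whenever log-scale residuals vary.
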

Aim 3: to explore the relationship between various stoploss reinsurance thresholds and
subsequent  changes  in  group  (versus  individual)  measures  of  model  performance,  given 
a  selection  of  risk  adjustment  methods. 

Hypothesis  3a:  Group-level  measures  of  predictive  accuracy  will  improve  as  the  maximum 
value  associated  with  individual-level  stoploss  reinsurance  decreases. 

Hypothesis  3b:  Group-level  measures  of  predictive  accuracy  will  improve  with  increasingly 
complex  treatment  of  the  dependent  measure. 

The  first  two  sets  of  hypotheses  focus  on  individual-level  measures  of  model  performance  given 
various  stoploss  levels  and  other  data  treatments.  However,  rate-setting  occurs  within  the  context 
of  populations  of  individuals  subject  to  a  variety  of  selection  effects  discussed  in  Chapter  1. 
Hypotheses  3a  and  3b  are  intended  to  examine  some  assumptions  regarding  the  relationship 
between  those  individual-level  measures  and  group-level  measures  of  each  respective  model.  One 
assumption  is  that  group-level  measures  will  simply  reflect  model  performance  measured  at  the 
individual  level.  A  competing  assumption  is  that  the  process  of  grouping  individuals  will  "wash 
out"  differences  noted  at  the  individual  level.  A  third  assumption  that  has  received  very  little 
attention  in  published  literature,  but  that  is  implied  in  the  work  of  Duan  et  al.  (1982),  is  that  the 
form  of  the  dependent  measure  used  to  generate  individual-level  expectations  may  be  a  source  of 
bias.  That  bias  may  be  more  evident  at  the  group  level.  Generally,  to  what  extent,  and  how,  do 
measures  based  on  groups  of  individuals  reflect  similar  measures  based  on  individual-level  data? 

Measures  of  predictive  accuracy  at  the  group  level  will  be  based  on  ratios  of  expected-to-actual 
values.  They  will  include  the  mean  forecasting  bias,  the  mean  squared  forecasting  error  and  the 
percent of groups that fall within 5 percent of a predictive ratio of 1.00. Because the number of
enrollees  assigned  to  any  given  PCP  is  limited,  PCPs  (and,  thus,  individual  plan  members  assigned 
to  those  PCPs)  will  be  assigned  at  random  (with  replacement  at  the  group  level)  to  groups  of  PCPs 
to  mimic  enrollment  populations  of — roughly — 500,  1500,  3000,  and  5000  plan  members.  Sixty 
provider  groups  will  be  drawn  at  each  of  these  enrollment  levels.  Ratios  of  observed  to  expected 
values  will  be  calculated  at  the  group  level  and  summarized  across  groups  of  the  same  relative  size. 
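
The construction of pseudo groups and the group-level summaries can be sketched as follows. The sketch assumes that the predictive ratio for a group is total expected cost divided by total actual cost, that forecasting bias and error are measured as deviations of that ratio from 1.00, and that PCPs are pooled at random until the target enrollment is reached. Function names and data layouts are hypothetical.

```python
import random


def draw_group(pcp_members, target_size, rng):
    """Pool randomly chosen PCPs until roughly target_size members are gathered.

    pcp_members maps a PCP id to a list of member ids. Each call starts from
    the full set of PCPs, so a PCP may appear in more than one group.
    """
    pcps = list(pcp_members)
    rng.shuffle(pcps)
    members = []
    for pcp in pcps:
        if len(members) >= target_size:
            break
        members.extend(pcp_members[pcp])
    return members


def group_summary(groups, expected, actual, tolerance=0.05):
    """Mean bias, mean squared error, and share of groups within tolerance of 1.00."""
    ratios = [sum(expected[m] for m in g) / sum(actual[m] for m in g) for g in groups]
    n = len(ratios)
    bias = sum(r - 1.0 for r in ratios) / n
    mse = sum((r - 1.0) ** 2 for r in ratios) / n
    within = sum(1 for r in ratios if abs(r - 1.0) <= tolerance) / n
    return bias, mse, within
```

In use, sixty groups would be drawn at each enrollment level by calling the first function sixty times with the same random generator, and the resulting groups would be held fixed across models.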

As  noted  earlier,  PCP  assignment  is  used  in  this  study  simply  as  a  means  to  aggregate  individuals. 
There  is  no  assumption  beyond  that  association,  as  there  might  be  if  the  intention  was  to  profile 
physician  practice.  Moreover,  previous  studies  have  shown  that  groups  of  fewer  than  several 
hundred  members  produce  unstable  results  for  the  purpose  of  setting  payment  rates  (Robinson  et  al. 
1991).  There  were  too  few  PCPs,  in  either  plan,  to  establish  a  sufficient  number  of  groups  of  any 
significant  size  that  might  be  comparable  to  the  sets  of  60  groups  described  above.  Thus,  groups 
of  plan  members  identified  with  any  given  PCP  were  not  used  as  independent  units  in  this  study. 

Since the same sets of groups will be used across each of the seven models, the group-level
summary  measures  can  be  assessed  as  paired  analyses  across  the  different  models,  stoploss  levels, 
and  alternative  data  treatment  methods  within  any  given  group  size.  That  is  to  say,  for  example, 
that  the  ACG  model  results  for  groups  of  3,000  plan  members  and  a  stoploss  level  of  $25,000  can 
be  compared  to  results  for  groups  of  3,000  for  a  different  model  because  the  underlying  groups  are 
the  same  in  both  cases.  In  the  case  of  the  mean  squared  forecasting  error,  there  is  no  formal 
statistical  test  to  make  such  a  comparison,  although  a  lower  measure  is  preferable.  The  mean 
associated  with  forecasting  bias  can  be  subjected  to  a  statistical  comparison  of  paired  differences 
(t) across model alternatives to determine statistically significant differences underlying comparable
means. The percentage of groups within 5 percent of a predictive ratio of 1.00 can be thought of as a binomial
distribution,  where  a  given  group  is  assigned  "1"  if  it  falls  within  that  band  and  "0"  otherwise. 
Thus  that  measure  can  be  formally  tested  using  a  McNemar  test  for  paired  proportions  (Armitage 
and  Berry  1991).  Each  of  these  tests  will  be  used  on  a  selective  basis  to  test  the  significance  of 
differences  that  are  evident  between  alternative  models. 
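
The two paired tests can be sketched as follows, returning test statistics only. The McNemar statistic is shown with an optional continuity correction; whether the correction is applied in the analysis itself is not stated here, so that choice, like the function names, is an assumption of this sketch.

```python
import math
import statistics


def paired_t(values_a, values_b):
    """Paired t statistic and degrees of freedom for two models on the same groups."""
    diffs = [a - b for a, b in zip(values_a, values_b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


def mcnemar_chi2(flags_a, flags_b, continuity=True):
    """McNemar chi-square (1 df) for paired 0/1 flags on the same groups."""
    b = sum(1 for x, y in zip(flags_a, flags_b) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(flags_a, flags_b) if x == 0 and y == 1)
    if b + c == 0:
        return 0.0  # no discordant pairs, so no evidence of a difference
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)
```

Only the discordant pairs, groups that fall within the band under one model but not the other, contribute to the McNemar statistic.
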

CHAPTER 4
THE  TREATMENT  OF  UNDERLYING  DATA 


This  study  was  designed  to  examine  the  relative  effects  of  individual-level  stoploss  reinsurance 
used  in  the  process  of  setting  risk-adjusted  capitation  payment  rates.  Data  drawn  from  two 
geographically  distinct  IPA-model  HMOs  were  used  to  set  cost  expectations  given  versions  of  4 
risk  adjustment  methods.  Particular  attention  was  placed  on  examining  the  effects  of  alternative 
data  treatment  methods  typically  used  to  generate  cost  expectations,  including  data  transformation 
and  multipart  modeling  techniques.  Both  individual  and  group-level  measures  of  model 
performance  were  used  in  this  examination.  Study  parameters  include  the  health  plan,  reinsurance 
levels,  risk  adjustment  methods,  other  data  treatment  methods,  and  various  aggregations  of 
individual  plan  members  that  simulate  enrollment  groups.  As  such,  the  operational  core  of  this 
study  involved  extensive  repeated  calculations  to  generate  payment  expectations  given  each  study 
parameter,  and  an  array  of  summary  measures  used  to  assess  their  relative  effects. 

Because  the  treatment  and  application  of  the  underlying  data  are  so  central  to  the  issues  examined 
in the study, this chapter will describe that activity in detail. In addition to the technical overview
presented in this chapter, Appendix B includes further detail of specific aspects, where
appropriate and noted in this chapter, as well as a complete set of study summary results.

INITIAL  DATA  PROCESSING 

The  data  available  for  this  study  were  initially  processed  for  a  prior  study  of  how  risk  adjustment 
methods  might  be  applied  in  setting  payment  rates  and  in  profiling  physicians  (Weiner  et  al.  1994). 
The  author  participated  as  the  project  manager  and  senior  analyst  for  that  study.  The  procedures 
used  to  process  the  data  are  reported  here,  although  much  of  this  initial  work  was  conducted  prior 
to  the  author's  access  to  the  data  for  this  study. 

Both  HMOs  included  in  this  study  employed  the  same  underlying  administrative  data  management 
system.  The  data  were  initially  drawn  from  two  types  of  files:  membership  history  files  that 
contained  enrollment  and  disenrollment  dates,  age,  gender,  and  other  demographic  information; 
and,  utilization  or  claim  files,  exclusive  of  pharmacy  data.  The  claims  included  detailed 
information  on  every  service  billed  including  the  specific  procedure  performed  (CPT  code),  the 
diagnosis  code  (ICD-9-CM),  the  date  of  service,  and  the  dollar  amounts  billed  and  allowed. 
Payments  to  individual  providers  were  primarily  made  on  a  discounted  basis,  and  a  claim  was 
required  for  each  service  rendered.  Costs  used  in  this  study  were  based  on  allowed,  rather  than 
billed,  charges. 

Individual-level  records  were  drawn  from  the  membership  history  files  to  reflect  demographic 
information,  enrollment  history,  and  PCP  assignment  for  each  person.  Those  records  included  key 
information such as a member identification number that could be used to link general membership
information  with  specific  claim  data.  Claim  data  reflecting  dates  of  service,  diagnoses,  and  the 
allowed  costs  associated  with  related  services  were  aggregated,  to  the  individual  level,  into  annual 
summaries  based  on  two  consecutive  fiscal  years  beginning  July  1,  1990  and  ending  June  30,  1992. 
For  ease  of  reference,  the  two  study  periods  will  be  denoted  as  year-1  and  year-2  for  much  of  the 
remainder  of  this  chapter.  ACG  assignment  was  made  for  each  individual  for  each  12-month  fiscal 
period  based  on  all  recorded  diagnoses.  The  version  of  the  ACG  "grouper"  software  used  to  make 
that  assignment  was  the  most  current  available  to  the  original  study  in  1994.  All  data  were 
screened  for  duplicate  records. 
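
The aggregation of claims into annual summaries can be sketched as follows, using the fiscal-year boundaries given above. The record layout (member identifier, date of service, allowed amount) and the function names are hypothetical; the layout of the plans' actual claim files is not shown.

```python
from collections import defaultdict
from datetime import date

YEAR_1_START = date(1990, 7, 1)
YEAR_2_START = date(1991, 7, 1)
STUDY_END = date(1992, 6, 30)


def study_year(service_date):
    """Return 1 or 2 for the fiscal year of a service date, or None if outside."""
    if YEAR_1_START <= service_date < YEAR_2_START:
        return 1
    if YEAR_2_START <= service_date <= STUDY_END:
        return 2
    return None


def annual_allowed_costs(claims):
    """Sum allowed charges by (member_id, study year).

    claims: iterable of (member_id, service_date, allowed_amount) tuples.
    Claims dated outside the two study years are ignored.
    """
    totals = defaultdict(float)
    for member_id, service_date, allowed in claims:
        year = study_year(service_date)
        if year is not None:
            totals[(member_id, year)] += allowed
    return dict(totals)
```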

The  membership  and  claim  summary  data  were  merged  in  a  master  analysis  file  that  initially 
included  one  record  for  each  health-plan  member  enrolled  for  at  least  one  month  during  the  study 
period.  Separate  files  were  maintained  for  each  HMO.  For  the  purposes  of  this  study,  these  plans 
will be referred to as "HMO-A" and "HMO-B". The initial study files included 70,812 and 58,946
individuals  for  HMO-A  and  HMO-B,  respectively. 

Study-Specific  Data  Development 

A  more  refined  analysis  file  was  generated  to  reflect  the  specific  needs  of  this  study.  First,  and  as 
discussed  in  Chapter  3,  the  study  was  limited  to  plan  members  who  were  enrolled  for  the  full  length 
of  the  study  period  in  order  to  ensure  that  the  population  used  to  estimate  rating  expectations  was 
similar  in  every  other  reasonable  respect  to  that  of  the  rating  population.  This  screen  also  made  it 
possible  to  calculate  the  maximum  R2,  which  requires  full  period-specific  data,  for  each  plan.  The 
resultant  populations  were  22,335  and  17,689  for  HMO-A  and  HMO-B,  respectively. 

Some  consideration  was  then  needed  to  establish  practitioner  association  within  each  plan  sufficient 
to  serve  as  a  focus  for  aggregating  plan  members  in  the  study.  Each  enrollee  in  each  study  plan 
chose,  or  was  assigned,  a  primary  care  practitioner  (PCP)  at  the  time  of  their  enrollment  who 
was — at  least  nominally — responsible  for  coordinating  the  care  that  member  received  through  the 
health  plan.  The  PCPs  in  the  study  plans  are  independent  practitioners  who,  typically,  maintain 
offices  in  single-member  or  small-group  practices  that  have  a  defined  contractual  arrangement  to 
provide  services  to  plan  members,  though  not  necessarily  on  an  exclusive  basis  with  the  plan. 
Between  300  and  500  PCPs  were  associated  with  each  of  the  study  plans.  From  1  to  approximately 
500  individuals  were  associated  with  any  given  PCP  in  the  study  populations. 

While most enrollees retained the same PCP throughout the study period, that assignment could
change. For example, a plan member could request a change, or a provider could leave the plan.
Up  to  9  PCP  changes,  as  well  as  the  starting  and  ending  dates  of  those  changes,  were  recorded  on 
the  original  enrollment  data  files.  Year-1  and  year-2  PCP  assignments  were  made  based  on  the 
assignment recorded at the beginning of those periods. It is this assignment that was used as the
basis  for  associating  individuals  with  specific  PCPs. 

Once  PCP  assignment  was  determined  for  these  analyses,  the  study  populations  were  split  into 
estimation  and  validation  subpopulations  based  on  that  assignment.  Each  PCP,  and  by  extension 
each  individual  associated  with  that  PCP,  was  assigned  at  random  to  one  of  two  groups  in  each 
plan.  The  first  group  was  used  to  estimate  the  parameters  associated  with  each  rating  criterion. 
Those  parameters  were  then  applied  to  the  second  group  to  establish  rating  expectations  that  could, 
in  turn,  be  compared  to  actual  service  costs  during  the  rating  period  (year-2).  While  PCPs  were 
randomly  assigned  to  each  major  subsample,  the  populations  of  individuals  that  make  up  the 
estimation  and  validation  samples  will  differ  to  the  extent  that  the  PCP  assignment  reflects 
selection  differences  across  PCPs.  In  other  words,  mean  costs  associated  with  each  sample,  for 
example,  may  differ  if  PCPs  who  attract  high-cost  patients  tend  to  fall  within  either  the  estimation 
or  validation  samples. 
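
A minimal sketch of this split-half design follows. It assumes that PCPs are shuffled and divided into two equal halves; the text says only that each PCP was assigned at random to one of two groups, so the equal-halves rule, the function name `split_by_pcp`, and the identifiers are assumptions made for illustration.

```python
import random

def split_by_pcp(member_to_pcp, seed=0):
    """Assign whole PCP panels, not individual members, to the two subsamples."""
    rng = random.Random(seed)
    pcps = sorted(set(member_to_pcp.values()))
    rng.shuffle(pcps)
    # Assumption: the first half of the shuffled PCPs forms the estimation sample.
    estimation_pcps = set(pcps[: len(pcps) // 2])
    estimation = {m for m, p in member_to_pcp.items() if p in estimation_pcps}
    validation = set(member_to_pcp) - estimation
    return estimation, validation
```

Because every member follows his or her PCP, any selection differences across PCPs carry over into the two subsamples, which is the effect the paragraph above describes.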

PCP  assignment  was  also  used  to  establish  groups  of  plan  members  that  were  larger  than  those 
typically  associated  with  individual  practitioners.  PCPs  were  randomly  assigned  (with  replacement 
at  the  group  level)  to  form  groups  of  enrollees  of  approximately  500,  1500,  3000,  and  5000 
members.  Sixty  groups  were  formed  at  each  enrollment  level.  The  actual  number  of  enrollees 
varied  across  the  groups.  For  example,  the  number  of  members  in  the  largest  groups  ranged  from 
5001 to 5176 for HMO-A. One way to think of these groups is as "pseudo" group practices with
various  sizes  of  enrollment.  As  was  noted  with  respect  to  the  estimation  and  validation  samples, 
groups  of  enrollees  formed  on  the  basis  of  PCP  assignment  may,  potentially,  differ  in  some  evident 
respect to the extent that PCP assignment reflects selection differences across PCPs. Figure 4.1 is
a  schematic  representation  of  the  sampling  design. 
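
One way the pseudo-groups might be formed is sketched below. The stopping rule (add PCPs in random order until enrollment exceeds the target) is an assumption inferred from the reported range of 5001 to 5176 members in the largest groups; it is not stated in the text. The sketch also assumes that a PCP appears at most once within a group but may appear in more than one group. All names are hypothetical.

```python
import random

def form_pseudo_group(panel_sizes, target, rng):
    """Add PCPs in random order until the group's enrollment exceeds the target."""
    order = list(panel_sizes)
    rng.shuffle(order)
    group, enrolled = [], 0
    for pcp in order:
        if enrolled > target:
            break
        group.append(pcp)
        enrolled += panel_sizes[pcp]
    return group, enrolled

def form_groups(panel_sizes, target, n_groups=60, seed=0):
    """Form n_groups pseudo-groups; a PCP may be reused across different groups."""
    rng = random.Random(seed)
    return [form_pseudo_group(panel_sizes, target, rng) for _ in range(n_groups)]
```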

Exploratory  Data  Analysis 

The  next  step  in  the  analysis  was  to  examine  the  distribution  of  the  dependent  measure  (total 
service costs) in each plan. Tables 4.1 and 4.2 show some basic statistics derived from total
service  costs  for  the  study  populations  in  HMO-A  and  HMO-B,  respectively.  Service  costs  are 
presented,  here  and  in  subsequent  tables,  as  per-member-per-month  (PMPM)  amounts  to  suggest 
an  association  with  monthly  capitation  rates.  Each  of  the  dollar  amounts  listed  in  the  tables  can  be 
annualized,  if  necessary,  by  multiplying  the  table  values  by  12.  Columns  (1)  and  (2)  of  the  tables 
show  measures  based  on  year-1  and  year-2  cost  data  for  the  study  populations  as  a  whole,  before 
any  truncation  related  to  stoploss  alternatives.  Columns  (3)  through  (6)  are  based  on  year-2  cost 
data  treated  to  reflect  the  4  stoploss  reinsurance  levels  ($50,000,  $25,000,  $10,000,  and  $5,000) 
included  in  the  study. 

Figure 4.1: Schematic for Sampling

Row 1 of each table shows the number of respective plan members included in the study
population.  Row  2  shows  the  mean  service  costs  included  in  the  analyses  for  the  study  population 
before  any  truncation.  Row  3  shows  the  mean  costs  remaining  once  stoploss  was  modeled  using 
the  data.  That  is,  for  each  of  columns  (3)  through  (6),  individual  level  cost  data  were  truncated  at 
the  designated  stoploss  level.  Ten  percent  of  any  truncated  costs  were  added  back  at  the  individual 
level  to  mimic  a  copayment  amount  above  the  stoploss.    The  mean  of  charges  remaining  after 
truncation  (row  3)  goes  down  with  each  successively  lower  truncation  level  because  additional 
costs  are  being  removed  from  the  analysis  with  no  offsetting  return  of  those  costs.  An  implicit 
assumption  underlying  this  presentation  is  that  costs  that  are  removed  in  this  way  will  be  treated  as 
per-member  administrative  expenses  that  are  not  appropriately  included  in  the  calculation  of  risk 
adjusted  expectation.  Truncated  costs  would  clearly  have  to  be  included  in  an  overall  assessment  of 
the  effects  of  stoploss  coverage.  The  implications  of  such  costs  will  be  discussed  as  part  of  the 
more  detailed  analysis  presented  in  the  next  chapter. 

Rows  4  through  6  list  the  standard  deviation,  maximum,  and  minimum  values,  respectively,  for  the 
dollars  included  in  row  3.  Row  7  shows  the  number  of  cases  that  were  truncated  at  each  stoploss 
level,  and  row  8  shows  the  average  dollars  truncated  per  case.  Row  9  shows  the  average  dollars 
truncated  for  the  study  population  as  a  whole.  For  the  columns  that  reflect  truncated  costs,  the  sum 
of  rows  3  and  9  equals  the  dollar  amount  shown  in  row  2. 
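
The truncation rule and the row 2 = row 3 + row 9 identity can be illustrated with a short sketch. It assumes that each stoploss level applies to an individual's annual costs and that PMPM amounts are annual amounts divided by 12; the function and key names are hypothetical.

```python
def apply_stoploss(annual_cost, stoploss, coinsurance=0.10):
    """Truncate at the stoploss level, adding back 10 percent of the excess."""
    excess = max(annual_cost - stoploss, 0.0)
    retained = min(annual_cost, stoploss) + coinsurance * excess
    return retained, annual_cost - retained

def stoploss_summary(annual_costs, stoploss):
    """Population-level PMPM means before and after truncation."""
    n = len(annual_costs)
    pairs = [apply_stoploss(c, stoploss) for c in annual_costs]
    return {
        "mean_pmpm_before": sum(annual_costs) / n / 12,       # analogue of row 2
        "mean_pmpm_after": sum(r for r, _ in pairs) / n / 12,  # analogue of row 3
        "cases_truncated": sum(1 for _, x in pairs if x > 0),  # analogue of row 7
        "mean_pmpm_removed": sum(x for _, x in pairs) / n / 12,  # analogue of row 9
    }
```

For a member with $60,000 in annual costs and a $50,000 stoploss, the sketch retains $51,000 and removes $9,000.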

Several additional points are worth noting regarding the data presented in these tables. Although
the subsequent analyses involved separate treatment of estimation and validation subpopulations,
some  initial  measures,  particularly  the  maximum  R2  and  power  estimation  calculations,  were 
derived  using  data  for  the  population  as  a  whole.  Separate  analyses  were  made  at  each  of  the  4 
stoploss  truncation  levels  but,  again,  using  data  for  the  population  as  a  whole,  rather  than  the 
estimation  or  validation  samples  alone. 

The formula used to calculate the maximum R2 (Max R2), shown in row 10 of Tables 4.1 and 4.2,
was derived from Newhouse et al. (1989) and described in the previous chapter. The results based
on untruncated costs, as well as costs given each truncation level, are shown in the tables. As those
data show, the Max R2 is .402, or 40 percent for untruncated costs in HMO-A (row 10 of Table
4.1), and just under 30 percent at each of the 4 truncation levels. HMO-B had a lower measure
based on untruncated costs (.316) that remained much the same across truncation levels. The Max
R2  was  roughly  30  percent,  given  truncated  costs,  in  both  study  plans. 

As  has  been  noted  in  previous  chapters,  the  distribution  of  health  service  costs  is  typically  skewed, 
in  part,  because  of  relatively  few  high-cost  cases  in  any  given  general  population,  and  because  the 
variance  between  individual  cases  is  much  greater  at  high-cost  levels  than  at  lower  levels.  It  is  also 
important  to  remember  that  the  expectations  that  underlie  risk  adjustment  are,  essentially,  sample 
mean  values  associated  with  given  risk  factors.  The  mean  of  a  sample  drawn  from  some  larger 
population  of  events  (such  as  all  individuals  who  might  fall  into  a  specific  ACG  category)  is  an 
unbiased  estimate  for  that  larger  population,  when  the  variable  being  estimated  is  normally 
distributed  in  the  larger  population  (Armitage  and  Berry  1991,  Hoaglin  et  al.  1983).  If  the 
underlying  distribution  is  skewed,  the  mean  of  a  sample  drawn  from  that  distribution  will  also  be 
skewed.  The  work  of  Duan  et  al.  (1982)  described  in  Chapter  2  was  specifically  designed  to 
address  the  limitations  of  a  typical  distribution  of  health  service  costs  on  making  such  statistical 
inferences. Again, as discussed in Chapter 2, a power calculation can be made to determine which,
if any, of a series of possible transformations might be appropriate to improve the reliability of
statistical inferences made from the data. The power calculation used in this study was derived
from the slope of a spread-versus-level plot as outlined in Hoaglin et al. (1983), using a decile
ranking  of  total  year-2  service  costs.  This  approach  entails  regressing  measures  of  the  spread  of 
values  within  specific  subsets  of  the  data  (the  fourth-spread)  on  the  medians  of  those  subsets.  The 
power  value  (p)  is  derived  by  subtracting  the  slope  of  that  regression  from  1.00.  The  individual- 
level  cost  data  for  each  plan  were  split,  based  on  a  rank  order  of  those  values,  into  ten  groups 
across  which  the  power  calculation  was  made. 
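
The power calculation can be sketched as follows. Two assumptions are made here that the text does not state: the regression is run on the logs of the fourth-spreads and medians (the usual form of a spread-versus-level plot), and zero-cost cases are excluded so that those logs are defined. The function name is hypothetical.

```python
import math
import statistics

def power_from_spread_vs_level(costs, n_groups=10):
    """Return p = 1 - slope of log(fourth-spread) on log(median) across rank groups."""
    positive = sorted(c for c in costs if c > 0)  # assumption: drop zero costs
    size = len(positive) // n_groups
    xs, ys = [], []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(positive)
        chunk = positive[start:end]
        q1, _, q3 = statistics.quantiles(chunk, n=4)
        spread = q3 - q1  # fourth-spread, approximated by the interquartile range
        if spread > 0:
            xs.append(math.log(statistics.median(chunk)))
            ys.append(math.log(spread))
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1.0 - slope
```

When spread grows in proportion to level the slope is 1 and p is 0, which points toward a log transformation; when spread is constant across levels the slope is 0 and p is 1, which points toward no transformation.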

Power  values  less  than  1.00  generally  indicate  that  log  transformation  is  appropriate  (Armitage  and 
Berry  1991,  Hoaglin  et  al.  1983).  That  is  increasingly  so  as  the  value  approaches  zero.  One 
notable  exception  is  that  a  value  of  .5  may  indicate  that  a  square  root,  rather  than  log, 
transformation is appropriate. Power values reported in Tables 4.1 and 4.2 (row 11) suggest that a
log transformation is broadly appropriate, for both study plans, for these analyses. That is
particularly true for HMO-B with a value of .176 to .009 across the truncation levels reported. The
p values for HMO-A at higher truncation levels (.299, shown in row 11 of Table 4.1) might
suggest  either  a  log  or  a  square  root  transformation,  since  they  are  closer  to  .5  than  to  zero. 
However,  in  the  absence  of  a  more  definitive  indication  that  a  square  root  transformation  was 
necessary,  the  log  was  deemed  most  appropriate  for  this  study  because  of  the  prominence  of  its  use 
in  the  literature,  and  in  order  to  maintain  consistency  of  treatments  across  the  two  plans  in  this 
study.  As  a  further  check  on  the  need  to  transform  the  data,  power  calculations  were  also  made 
using  alternative  groupings  of  individuals  based  on  the  independent  measures  of  age  and  gender. 
Separate  power  calculations  were  made  given  two  groups  defined  by  gender,  and  8  groups  defined 
by  age  categories.  In  each  case,  the  resultant  power  values  were  significantly  less  than  1.00,  and 
less  than  .5. 

Figures  4.2  and  4.3  are  normal  probability  plots  drawn  from  data  for  HMO-A  that  illustrate  the 
effect  that  a  log  transformation  has  on  the  distribution  of  the  study  data.  A  normal  probability  plot 
is  used  to  compare  the  distribution  of  a  set  of  data  to  a  standardized  normal  distribution.  Data  that 
are  clearly  drawn  from  a  normal  distribution  will  present  as  a  relatively  straight  line  running  from 
the  lower  left  to  the  upper  right  of  such  a  plot.  Figure  4.2  reflects  nonzero  values  of  the  raw  data. 
Those  raw  data  are  highly  skewed  to  the  right.  Figure  4.3  reflects  the  log  values  of  those  data.  The 
log  values  indicate  a  marked  improvement  in  the  distribution  of  the  data.  At  the  same  time,  this 
transformation  does  not  take  care  of  the  problem  entirely. 

A Kolmogorov test for goodness of fit to a normal distribution was used on the data underlying
Figures 4.2 and 4.3. That statistic is a measure of how similar two distributions are. A
Kolmogorov value at or below .012 (from p. 718 in Daniel [1987]: p = .99, n = 19,963, table D =
1.63 / 141.29, where 141.29 is the square root of n) would suggest that there is no significant
difference between the distribution of the study data and a normal distribution. That value, given
the actual cost values reflected in Figure 4.2, was
.412. The Kolmogorov value given the log values reflected in Figure 4.3 was .083. While these
measures  suggest  that  the  underlying  distribution  of  both  actual  and  log  values  is  not  completely 
normal,  log  values  are  clearly  closer  to  that  ideal  than  the  actual  cost  values  in  this  study.  This 
result  helps  to  reinforce  the  indication  of  the  power  calculations  that  log  transformation  is  a 
defensible  alternative  to  using  actual  cost  values  in  estimating  expectations  from  those  data. 
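
The comparison above can be reproduced in outline with the sketch below. It assumes that the reference normal distribution is fitted from the sample mean and standard deviation, which the text does not specify, and it uses the large-sample coefficient of 1.63 cited from Daniel (1987). The function names are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev

def kolmogorov_distance(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(fmean(xs), pstdev(xs))  # assumption: parameters from sample
    gap = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Check the gap just after and just before each step of the empirical CDF.
        gap = max(gap, abs((i + 1) / n - cdf), abs(cdf - i / n))
    return gap

def critical_value(n, coefficient=1.63):
    """Large-sample critical value: coefficient divided by the square root of n."""
    return coefficient / math.sqrt(n)
```

With n = 19,963 the critical value rounds to .012, the threshold quoted above.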

Figure 4.2: Normal Probability Plot, Year-2 Service Costs (HMO-A)

Figure 4.3: Normal Probability Plot, Log of Year-2 Service Costs (HMO-A)

Practicality and convention, rather than statistical precision alone, also play a role in considering a
form for transformation (Box and Cox 1964). A power value (p) suggests an exponent that might
be used to improve the distribution of a set of data that is used to make statistical inferences. As
indicated by the values reported for HMO-A in Table 4.1, that might entail transforming data
truncated at $50,000 with an exponent of .299. The log or square root are proposed as first order
considerations  for  p  values  less  than  1.00  because  they  are  more  readily  understood  and  more  easily 
applied  as  a  standard  than  more  precise  exponents.  Using  log  values  in  this  way  has  the  secondary 
advantage  of  being  consistent  with  common  practice  for  the  treatment  of  population-level  cost  data 
(Armitage  and  Berry  1991,  Hoaglin  et  al.  1983).  Thus,  the  log  of  service  costs  is  used  as  the  first 
level  transformation  for  data  from  both  plans  in  this  study.  Figure  4.4  illustrates  the  general 
pattern  of  data  treatment  used  in  this  study. 

Figure  4.4:  Schematic  for  Data  Treatment 

Common assumptions regarding the distribution of service costs across age and gender were also
considered in examining the study data. Figure 4.5 is a plot of average total costs in year-2 of the
study  by  age  in  year-1  for  HMO-A.  A  cross-year  comparison  is  used  in  this  graph  to  highlight  the 
transition  relationship  between  risk  factors  (in  this  case,  age)  and  the  dependent  measure  (costs). 
This  plot  shows  that  average  costs  exhibit  a  slight  hook,  or  "U",  shape.  Costs  go  down  slightly 
with  age  among  the  very  young  and  then  rise  steadily,  overall,  with  a  dip  in  the  middle  years.  This 
pattern  suggests  some  nonlinearity  of  the  dependent  measure.  One  common  way  to  address  this 
result  is  to  enter  age  as  a  squared  term  in  regression  modeling,  when  age  is  included  as  a  continuous 
variable. 

Figure 4.5: Year-2 Service Costs by Year-1 Age (HMO-A)

Figure 4.6: Year-2 Service Costs by Year-1 Age and Gender (HMO-A)

Figure  4.6  reflects  the  same  data  as  those  included  in  the  previous  graph,  though  in  this  case 
average  costs  are  distributed  by  gender  as  well  as  by  age.  This  graph  indicates  that  there  are  some 
differences  in  the  general  pattern  exhibited  in  the  distribution  of  costs  by  age.  Women  tend  to  have 
higher  average  costs  during  child-bearing  years,  than  do  men.  An  age/gender  interaction  term  is 
typically  included  as  an  independent  variable  to  address  this  issue — again,  when  age  is  entered  as  a 
continuous  variable.  Together,  Figures  4.5  and  4.6  suggest  that  it  is  appropriate  to  include  a 
squared  term  for  age  and  an  age/gender  interaction  term  when  age  is  treated  as  a  continuous 
variable  in  this  study.  The  distribution  of  charges  by  age  and  gender  for  HMO-B  exhibited  very 
similar  patterns  as  those  described  here  for  HMO-A. 
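
The continuous-age terms suggested by these figures might be constructed as in the sketch below. The variable names follow those listed later in Table 4.7; defining Cross91 as the product of age and the gender indicator is an assumption, since the text describes it only as an age/gender interaction term.

```python
def continuous_age_terms(age, is_female):
    """Build the regressors used when age enters the model as a continuous variable."""
    gender = 1 if is_female else 0
    return {
        "Age91": age,
        "Gender": gender,
        "Asq91": age ** 2,        # squared term for the "U"-shaped age pattern
        "Cross91": age * gender,  # assumed form of the age/gender interaction
    }
```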

Tables  4.3  through  4.6  characterize  the  distribution  of  service  costs  by  specific  risk  factors.  Tables 
4.3 and 4.4 show total service costs by the year-1 age and gender categories that are used in this
study  for  HMO-A  and  HMO-B,  respectively.  Year-1  and  year-2  costs  are  shown  for  the  study 
populations  as  a  whole.  Year-2  costs  are  presented  for  the  estimation  and  validation 
subpopulations  as  well.  Tables  4.5  and  4.6  reflect  the  distribution  of  those  costs  by  year-1  ACG 
categories  for  each  respective  plan.  Generally,  the  distribution  of  costs  by  risk  factor  is  very  similar 
across  the  two  study  plans.  That  includes  the  fact  that  neither  plan  produced  any  members  in  ACG 
15  (Psychosocial,  with  psychiatric  major,  with  psychiatric  minor).  Both  plans  significantly  limit 
mental  health  benefits. 

One  exception  worth  minor  note  is  that  HMO-A  has  a  significant  number  of  enrollees  who  fall  into 
ACG  51.  This  category  is  designed  for  those  who  generate  identifiable  costs  but  do  not  otherwise 
have  diagnoses  that  place  them  in  another  ACG.  From  an  examination  of  the  original  claim  data,  it 
was  clear  that  HMO-A  recorded  a  small  monthly  charge  (on  the  order  of  from  $3  to  $9)  for  some, 
though  not  all,  of  its  members.  This  charge  represents  some  portion  of  a  routine  capitation 
payment to providers. HMO-B did not appear to record such a payment. At the same time,
HMO-B had a higher percentage of individuals who had no service use during year-1 (ACG 52).
Including that administrative payment in HMO-A might, theoretically, improve results for that plan
if those payments reflect individuals who are, somehow, "active" nonusers who might then be
more likely to use services in the following year. In point of fact, year-2 cost amounts do not
clearly distinguish those in ACG 51 from those in ACG 52 in HMO-A. Year-2 costs more clearly
suggest a difference between those categories in HMO-B, although the number of individuals in
ACG 51 is small. The small administrative payment amounts in HMO-A seem to "wash-out" the
difference between ACG 51 and ACG 52. This example is also indicative of what would happen
on  a  larger  scale  if  the  costs  that  were  removed  in  the  process  of  truncating  the  original  data  for  this 
study  were  included  in  the  calculation  of  expectations,  rather  than  treated  as  a  separate  component 
of  the  analysis. 

As a cautionary note, Tables 4.5 and 4.6 show that some ACG categories have very few cases from
which  to  draw  parameter  estimates.  There  are  only  3  cases  in  ACG  27  in  either  plan,  for  example. 
Parameter  estimates  for  those  categories  are  likely  to  be  unstable.  It  is  not  immediately  clear  how 
much  of  a  problem  these  small  numbers  present.  It  is  also  not  clear  exactly  what  the  source  of  the 
problem  is.  Some  of  the  small  numbers  are  likely  to  be  the  result  of  how  the  benefit  package  is 
designed,  as  was  the  case  with  ACG  15.  Small  numbers  in  some  categories  are  also  likely  to  be  due 
to  the  limited  size  of  the  study  population.  In  any  case,  no  attempt  was  made  to  adjust  the  ACG 
system  to  these  particular  populations  with  the  understanding  that  at  least  some  small  amount  of 
bias  may  be  introduced  given  those  small  numbers.  In  this  case,  bias  may  be  introduced  because 
there  are  too  few  cases  from  which  to  draw  a  good  (stable)  estimate  of  costs  associated  with  some 
ACG  categories. 

Generating  Expected  Values 

Seven  distinct  models,  based  on  four  risk  adjustment  methods,  were  identified  for  this  study,  and 
introduced  in  Chapter  3.  Methods  reflecting  simple  demographic,  chronic  flag,  and  ADG  factors 
are  included  using  both  continuous  and  categorical  treatments  of  age.  Age  and  gender  are  included 
in  the  decision  rules  underlying  the  grouping  of  diagnoses  into  ACGs  and,  thus,  are  not  overtly 
included  in  the  regression  formulation  for  that  model.  Each  of  the  seven  models  was  used  as  the 
basis  for  regression  calculations  to  generate  expected  values. 

Table  4.7  lists  the  actual  variables  that  were  used  in  the  underlying  calculations.  This  table  is 
different  from  a  similar  table  in  the  previous  chapter  in  that  the  actual  variable  names  used  in  the 
statistical  calculations  are  listed.  In  addition,  certain  "reference"  variables  are  excluded.  When 
categorical  data  are  defined  in  a  regression  formula,  one  category  is  excluded.  The  effect  of  that 
category  on  the  resultant  expected  values  is  subsumed  in  the  intercept  term.  The  intercept  term 
represents  an  initial  dependent  variable  value  to  which  the  contribution  of  other,  overtly  defined 
factors,  can  be  added  to  determine  the  expected  value  for  any  given  individual. 

For  example,  since  the  variable  GENDER  is  defined  as  "1"  if  an  individual  is  female,  and  "0"  if  the 
person  is  male,  males  are  reflected — by  default — in  the  intercept  term  for  those  models  that  include 
gender.  The  parameter  associated  with  GENDER  is  added  to  the  intercept  term  when  it  is  coded  as 
"1"  (female)  for  any  given  individual,  to  account  for  the  relative  contribution  to  the  model  of  being 
female. Similarly, individuals with some indication of a chronic condition were defined as "1", and
"0"  otherwise,  so  that  the  intercept  term  of  those  models  reflects  individuals  with  no  chronic 
condition.  The  age  category  "35-44  years"  was  the  reference  category  when  AGE  was  defined  as  a 
categorical  variable.  Thus,  the  intercept  term  for  the  fully  categorical  chronic-flag  model  can  be 
interpreted  as  reflecting  males,  ages  35-44,  with  no  chronic  condition. 
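
The role of the reference category can be illustrated with a small sketch for the fully categorical chronic-flag model. The coefficient values used in any call are hypothetical; the only facts taken from the text are the 0/1 coding and the choice of "35-44 years" (the sixth of eight age categories) as the reference.

```python
REFERENCE_AGE_CATEGORY = 6  # "35-44 years", omitted from the regression

def categorical_terms(age_category, is_female, has_chronic_flag):
    """Dummy-code one person; the reference age category gets no indicator."""
    terms = {f"Age[{k}]": int(age_category == k)
             for k in range(1, 9) if k != REFERENCE_AGE_CATEGORY}
    terms["Gender"] = int(is_female)
    terms["Chronic Flag"] = int(has_chronic_flag)
    return terms

def expected_value(intercept, coefficients, terms):
    """Intercept plus the contribution of every indicator that is switched on."""
    return intercept + sum(coefficients[name] * value for name, value in terms.items())
```

For a male aged 35-44 with no chronic condition every indicator is zero, so the expected value is the intercept alone.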

Table 4.7: Model Definitions for Regression Analysis

Short    Model                Independent Variables
A_G(1)   Age/Gender (cont.)   Age91, Gender, Asq91, Cross91
A_G(2)   Age/Gender (cat.)    Age[1]-Age[5], Age[7]-Age[8], Gender
CHR(1)   Chronic (cont.)      Chronic Flag, Age, Gender, Asq91, Cross91
CHR(2)   Chronic (cat.)       Chronic Flag, Age[1]-Age[5], Age[7]-Age[8], Gender
ACG      ACG                  ACG[1]-ACG[51]
ADG(1)   ADG (cont.)          ADG[1]-ADG[34], Age, Gender, Asq91, Cross91
ADG(2)   ADG (cat.)           ADG[1]-ADG[34], Age[1]-Age[5], Age[7]-Age[8], Gender

(cont. = continuous; cat. = categorical)
(Dependent Measure: Total Service Costs)

Since  there  are  4  truncation  levels  ($50,000,  $25,000,  $10,000,  and  $5,000)  and  4  secondary  data 
treatment  options  (1  actual-dollar  and  3  log  models  -  see  Figure  4.4),  16  sets  of  expected  values 
were  generated  for  each  of  the  7  risk  adjustment  models  in  Table  4.7.  Once  all  the  related 
calculations were completed, a total of 112 expected values was associated with any given
individual  in  the  study. 
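
The count of 112 follows directly from crossing the three design factors, as the short enumeration below shows. The labels for the data treatments are shorthand chosen for this sketch.

```python
from itertools import product

stoploss_levels = [50_000, 25_000, 10_000, 5_000]
data_treatments = ["one-part actual", "one-part log", "two-part log", "four-part log"]
models = ["A_G(1)", "A_G(2)", "CHR(1)", "CHR(2)", "ACG", "ADG(1)", "ADG(2)"]

# One expected value per individual for every combination of the three factors.
conditions = list(product(stoploss_levels, data_treatments, models))
```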

Appendix  B  includes  output  from  the  computer  software  used  for  this  study  that  lists  selected 
parameter  estimates  generated  for  HMO-A.  Those  estimates  are  presented  simply  to  illustrate  the 
process  of  generating  expectations  from  regression  analyses.  They  are  not  intended  as  formal 
results  of  this  study,  and  should  not  be  used  except,  perhaps,  for  testing  purposes.  The  output 
included  in  the  Appendix  reflects  each  of  the  7  models  defined,  using  a  stoploss  level  of  $25,000, 
and  based  on  both  the  one-part  model  using  actual  dollar  and  the  four-part  model  using  logs. 

The  calculation  of  expected  values  progressed  in  a  kind  of  sequence  through  each  of  the  secondary 
data  treatment  options.  Expected  values  were  generated,  first,  using  the  actual  (not  transformed) 
dollar  totals,  truncated  at  the  4  stoploss  levels.  The  one-part  model  was  then  repeated  using  the  log 
of  the  respective  cost  amounts  in  the  models.  Since  it  is  not  possible  to  define  the  log  of  0  (zero), 
$1  was  added  to  the  underlying  total  service  costs  for  each  individual  in  the  one-part  version  of  the 
log  models  so  that  those  who  did  not  generate  any  costs  in  year-2  of  the  study  could  be  included. 
This  dollar  was  subtracted  in  the  process  of  converting  expected  values  to  the  original  dollar  scale. 

Following  the  pattern  of  analysis  described  by  Duan  et  al.  (1982),  two  multipart  models  were  then 
established  as  alternatives  to  the  one-part  log  model.  In  a  two-part  model,  initial  expected  values 
were derived from only those who actually used services in year-2. In order to complete the
calculation  of  expectations  for  the  population  as  a  whole,  those  initial  values  were  adjusted  for  the 
probability  that  those  who  exhibit  any  given  risk  factor  will  actually  use  services.  A  logistic 
analysis  was  used  to  determine  probabilities  of  service  use  from  the  study  data. 
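
In outline, the two-part expectation is the probability of any use multiplied by the expected cost among users. The sketch below assumes that the probability comes from a logistic model's linear predictor and that the cost regression is on the log scale; the `smearing` argument stands in for the retransformation adjustment and defaults to no adjustment. The function names are hypothetical.

```python
import math

def logistic_probability(linear_predictor):
    """Convert a logistic model's linear predictor into a probability of use."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))

def two_part_expectation(prob_use, log_cost_given_use, smearing=1.0):
    """Probability of any use times the retransformed expected cost among users."""
    return prob_use * math.exp(log_cost_given_use) * smearing
```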

Finally,  a  four-part  model  was  established  in  which  those  who  exhibit  inpatient  costs  were  analyzed 
separately  from  those  who  generated  ambulatory  costs  alone.  Overall  probabilities  of  use  are  the 
same  as  those  calculated  for  the  two-part  model.  However,  a  second  level  probability  was  also 
calculated  to  adjust  for  the  rate  at  which  individuals  fall  within  the  "ambulatory  only"  and 
"inpatient"  subgroups.  Expectations  for  the  four-part  model  are  then  derived  from  the  sum  of  the 
expectations  for  each  of  the  subgroups  multiplied  by  the  overall  probability  of  use.  In  statistical 
notation  the  four-part  model  can  be  expressed  as: 

$$P_{0j} \times \left\{ \left[ (1 - P_{1j}) \times AO_j \times S_{AO} \right] + \left[ P_{1j} \times I_j \times S_I \right] \right\}$$

where: $P_{0j}$ is the probability of generating any costs at all for any given individual given their
respective risk factors; $P_{1j}$ is the probability that an individual will use inpatient services, if they
are likely to use any services at all; $AO_j$ is the exponentiated expectation from the regression of
those who used ambulatory services only; and, $I_j$ is the exponentiated expectation from the
regression of those who used some inpatient services.

$S_{AO}$ and $S_I$ are "smearing" factors used to adjust expectations derived from log models for the
mean  of  the  original  dependent  measure  once  those  expectations  are  expressed  in  the  original 
(actual  dollar)  form.  The  method  developed  by  Duan  (1983)  was  used  as  the  primary  basis  for 
smearing  estimates  in  this  study. 
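
The four-part combination can be written as a single function. This sketch simply restates the expression above in code: the overall probability of use multiplies the sum of the ambulatory-only and inpatient expectations, each weighted by its conditional probability and its smearing factor. Argument names are hypothetical, and the numbers in any call are illustrative only.

```python
def four_part_expectation(p_use, p_inpatient, ambulatory_only, inpatient,
                          s_ao=1.0, s_i=1.0):
    """Combine the four parts into one expected cost on the original dollar scale."""
    ambulatory_part = (1.0 - p_inpatient) * ambulatory_only * s_ao
    inpatient_part = p_inpatient * inpatient * s_i
    return p_use * (ambulatory_part + inpatient_part)
```

For example, with an 80 percent chance of any use, a 10 percent chance of inpatient use among users, exponentiated expectations of 500 and 8,000, and smearing factors of 1.2 and 1.5, the combined expectation is 0.8 x (540 + 1,200) = 1,392.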

Table  4.8  is  presented  to  illustrate  the  effect  of  transforming  and  re-transforming  the  underlying 
cost  data  in  these  analyses,  and  the  extent  of  smearing  involved  in  each  of  the  log-based  models.  In 
this  exhibit,  data  reflect  the  12005  individuals  in  the  estimation  sample  used  to  calculate  cost 
expectations for HMO-A. The first row of the table shows the mean untransformed dollars that
were  included  in  the  calculations.  The  columns  of  this  table  show  mean  dollar  values,  for  4  of  the 
risk  adjustment  alternatives,  at  each  stage  of  the  process  of  developing  expected  values  on  the 
original  dollar  scale  from  the  log  models.  The  mean  of  expected  values  calculated  using  a  one-part 
actual-dollar model is the same ($75.17) regardless of risk adjustment method because there is no
transformation involved in those calculations. When expected values calculated using a one-part
log model were exponentiated to the original dollar scale, the mean of those expectations before a
smearing estimate was applied was $19.98 for the A_G(1) method, and less than 35 percent
(approximately $26) of the original raw-dollar mean ($75.17) for any of the 4 risk adjustment
methods.  After  a  smearing  estimate  was  applied  to  those  expectations,  mean  expected  charges 
were  between  94  and  106  percent  of  the  original  dollar  amount.  The  two  and  four-part  log  models 
accounted  for  higher  percentages  of  the  original  dollar  mean  in  the  log  values  generated  for  those 
models.  The  four-part  model  produced  nearly  the  same  mean  expected  charges  as  the  original 
dollar  mean  once  the  smearing  estimates  were  applied  (100  to  102  percent  of  mean  raw  dollars). 
Mean  expected  values  for  all  the  models — for  both  the  estimation  and  validation  samples — are 
included  in  Appendix  B. 

SUMMARY  MEASURES 

Once  expected  values  were  established  given  each  of  the  conditions  in  the  study,  a  series  of 
summary  measures  was  calculated  to  assess  the  relative  performance  of  each  model.  Individual- 
level  measures  were  derived  by  treating  each  health  plan  member  in  the  study  population  as  an 
independent  case — that  is,  having  an  equal  weight  in  the  analysis.  By  contrast,  group-level 
measures  subsume  the  effects  of  specific  individuals  within  groups  of  plan  members,  and  each  of 
those  groups  is  given  equal  weight  in  the  analysis. 

Individual-Level  Measures 

As  discussed  in  previous  chapters,  the  R2  is  the  principal  summary  measure  used  at  the  individual 
level.  It  reflects  the  extent  to  which  a  model  "explains"  the  underlying  variation  in  the  dependent 
variable.  The  values  used  in  this  study  are  adjusted  for  the  number  of  parameters  included  in  each 
respective  model.  Several  considerations  that  complicate  the  determination  of  that  measure  are 
important  to  remember,  however.  In  general,  while  an  R2  is  typically  listed  in  the  output  of 
computer  software  used  in  regression  analyses,  that  measure  is  most  relevant  to  the  data  actually 
used  in  the  calculation.  Two-part  models,  for  example,  involve  the  regression  of  data  for  only  those 
individuals  who  generate  service  costs.  The  second  "part"  of  that  modeling,  where  probabilities  of 
service use are applied to parameters derived from the regression calculations, will also affect the
extent to which those models ultimately explain underlying variation, as will the process of
re-transforming parameter estimates derived from log values to the original scale and the application
of smearing estimates.

Adjusted  R2  values  derived  from  both  the  estimation  and  the  validation  subpopulations  are  of  at 
least  passing  interest  in  this  study.  The  estimation  samples  represent  the  initial  calculations  and  the 
validation  samples  represent  the  application  of  those  initial  values.  Since  both  samples  are  drawn, 
to  some  extent,  at  random  from  the  same  population,  marked  differences  in  the  R2  for  the  two 
samples would indicate some basic problem of comparability across the samples. The estimation
sample  is  likely  to  exhibit  slightly  higher  values  given  the  overfitting  issue  that  the  split-half 
approach  is  designed  to  address.  All  of  the  adjusted  R2  values  reported  in  this  study  were 
calculated independently of the underlying regression analyses using the formulation described in
Shwartz  and  Ash  (1994).  Adjusted  R2  values  for  all  the  underlying  expectations  in  this  study  are 
included  in  Appendix  B,  including  both  the  estimation  and  validation  samples. 
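
As a rough illustration of computing an adjusted R2 directly from actual and expected values, independently of any regression output, the sketch below uses the common textbook adjustment for the number of parameters. It is an assumption that this matches the formulation in Shwartz and Ash (1994), which is not reproduced here. The function name and the sample values are hypothetical.

```python
def adjusted_r2(actual, expected, n_params):
    """Adjusted R2 from actual and expected values on the same (dollar) scale.

    n_params is the number of predictors in the model, excluding the intercept.
    Uses the textbook adjustment: 1 - (1 - R2) * (n - 1) / (n - p - 1).
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    sse = sum((a - e) ** 2 for a, e in zip(actual, expected))   # unexplained
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total
    r2 = 1.0 - sse / sst
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)

# Hypothetical per-member costs and model expectations.
actual = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
expected = [5.0, 5.0, 25.0, 25.0, 45.0, 45.0]
value = adjusted_r2(actual, expected, n_params=1)
```

Because the calculation takes only actual and expected values, it can be applied after re-transformation and smearing, and to the validation sample as well as the estimation sample.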

Mean  absolute  prediction  error  was  derived  from  the  individual-level  data  reflecting  the  validation 
samples  in  each  of  the  study  plans.  Three  "bands"  similar  to  those  reported  by  Dunn  et  al.  (1995) 
were  also  defined  using  those  individual-level  data.  The  first  two  bands,  the  percentages  of 
absolute  error  within  $25  and  $50  of  $0  (zero),  are  intended  as  measures  of  the  distribution  of  error 
around  perfect  prediction.  The  third  band  is  the  percentage  of  absolute  error  more  than  $400, 
which  is  a  measure  of  the  extreme  limit  of  the  absolute  error.  The  dollar  values  used  here  reflect 
per-member-per-month  differences  during  the  rating  period.  These  measures  are  also  reported  in 
Appendix  B,  for  all  of  the  underlying  expectations  in  the  study. 
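
The individual-level error measures described above can be sketched as follows. This is only an illustration: whether the $25 and $50 band boundaries are inclusive is an assumption, and the function name and sample values are hypothetical.

```python
def error_summary(actual, expected):
    """Mean absolute error and error bands, in per-member-per-month dollars."""
    abs_err = [abs(a - e) for a, e in zip(actual, expected)]
    n = len(abs_err)
    return {
        "mean_abs_error": sum(abs_err) / n,
        # Share of cases close to perfect prediction (boundaries assumed inclusive).
        "pct_within_25": 100.0 * sum(err <= 25 for err in abs_err) / n,
        "pct_within_50": 100.0 * sum(err <= 50 for err in abs_err) / n,
        # Share of cases in the extreme tail of absolute error.
        "pct_over_400": 100.0 * sum(err > 400 for err in abs_err) / n,
    }

# Hypothetical actual and expected PMPM costs for four members.
summary = error_summary([0.0, 60.0, 100.0, 1000.0], [20.0, 20.0, 90.0, 500.0])
```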

Group-Level  Measures 

Once  the  individual-level  expectations  were  established,  both  those  expectations  and  the  actual  cost 
experience  for  each  individual  were  aggregated  to  the  group  level  based  on  the  group  assignment 
process  described  above.  Group-level  measures  of  predictive  accuracy  were  then  summarized  for 
each  risk  adjustment  model  by  truncation  level,  other  data  treatment  type,  and  group  size.  The 
measures  defined  for  this  analysis  are  intended  to  characterize  group-level  performance  of  each 
model  in  much  the  same  way  as  those  defined  for  the  individual-level  analysis.  The  primary 
difference  is  that  groups  subsume  much  of  the  variation  exhibited  at  the  individual  level.  Each  of 
the  group-level  measures  is  reported  as  a  percentage,  rather  than  an  actual  cost  value,  associated 
with  differences  between  actual  and  expected  values. 

The  group-level  measures  in  this  study  were  derived  from  one  predictive  ratio,  calculated  for  each 
group  by  dividing  expected  by  actual  costs.  An  expected-to-actual  ratio  above  1.00  indicates  that 
the  expectation  for  that  group  was  greater  than  the  group's  actual  costs.  The  specific  group-level 
summary  measures  included  are  the  mean  forecasting  bias,  the  mean  squared  forecasting  error  of 
the  group-level  prediction,  and  the  percentage  of  groups  that  fall  within  5  percent  of  perfect 
prediction (1.00). Mean forecasting bias was calculated as the mean of the difference between the
expected-to-actual ratio and 1.00 (the ratio minus 1.00). Positive measures of bias indicate that, on average across the
relevant collection of 60 groups, expected values exceeded actual costs. The mean squared error
associated  with  each  model  was  calculated  from  the  differences  underlying  the  mean  forecasting 
bias,  and  a  smaller  error  suggests  a  better  (more  equitable)  dispersion  of  forecasting  error  when 
comparing  models.  The  number  of  groups  within  5  percent  of  1.00  is  the  sum  of  those  with 
absolute  values  of  forecasting  bias  less  than  .05. 
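
The group-level summary measures can be sketched as follows. The sign convention (ratio minus 1.00, so that positive bias means expected costs exceeded actual costs) is inferred from the description above, and the function name and group totals are hypothetical.

```python
def group_summary(groups):
    """Summarize predictive ratios across groups.

    groups: list of (expected_cost, actual_cost) pairs, one pair per group.
    Each group carries equal weight regardless of its size.
    """
    # Forecasting bias per group: expected-to-actual ratio minus 1.00.
    biases = [expected / actual - 1.0 for expected, actual in groups]
    n = len(biases)
    return {
        "mean_forecasting_bias": sum(biases) / n,
        "mean_squared_error": sum(b * b for b in biases) / n,
        "pct_within_5": 100.0 * sum(abs(b) < 0.05 for b in biases) / n,
    }

# Four hypothetical groups (the study summarizes 60 groups per condition).
result = group_summary([(110.0, 100.0), (90.0, 100.0), (102.0, 100.0), (100.0, 100.0)])
```

Note that over- and under-predictions offset each other in the mean forecasting bias but not in the mean squared error, which is why both are reported.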

Table  4.9  is  presented  to  illustrate  the  summarization  of  group-level  measures.  The  number  of 
individuals  and  the  mean  of  year-2  actual  costs  for  each  group  are  shown  to  the  left  of  the  table. 
Mean expected costs are listed for each risk adjustment model along with the predictive ratio (in
parentheses),  for  each  group.  Group-level  summary  measures  are  shown  at  the  bottom  of  the  table. 
Once  again,  mean  forecasting  bias  is  the  average  of  the  difference  between  1  and  the  predictive 
ratio  (across  the  60  groups).  Mean  squared  forecasting  error  is  the  average  of  the  squared 
forecasting bias. The percent of groups within 5 percent of perfect prediction is the number of groups with
predictive ratios within .05 of 1.00, divided by 60.
CHAPTER 5
RESULTS  AND  ANALYSIS 

INTRODUCTION 

The  initial  goal  of  this  study  was  to  get  a  better  understanding  of  the  relationship  between  risk 
limitation  associated  with  stoploss  reinsurance  and  risk  adjustment  that  is  used  to  account  for 
differences  in  the  distribution  of  health  service  costs.  Three  sets  of  hypotheses  were  proposed  in 
Chapter  3  to  provide  a  structure  for  examining  that  relationship  in  the  context  of  issues  that  might 
affect  the  assessment  of  risk  adjustment  methods.  By  the  end  of  this  chapter,  a  reader  should 
understand  the  nature  and  limitations  of  risk  limitation  applied  in  the  process  of  risk  adjustment. 
However,  the  key  to  understanding  the  effects  of  reinsurance,  or  any  seemingly  independent 
component  of  the  broader  process  of  applying  risk  adjusted  payment  rates,  is  to  come  to  terms  with 
the  nature  and  extent  of  bias  associated  with  each  factor  that  contributes  to  that  process  as  a  whole. 

Bias  can  be  broadly  defined  as  a  discrepancy  between  expected  and  actual  outcomes.  With  respect 
to  setting  capitation  rates,  risk  adjusted  expectations  are  assumed  to  differ  from  actual  outcomes  to 
some  definable  degree.  The  extent  to  which  those  expectations  fail  to  account  for  actual  costs  is  a 
measure  of  bias  in  the  expectations  that  may  be  introduced  from  a  variety  of  sources.  Bias 
associated  with  reinsurance,  bias  that  is  a  function  of  assumptions  regarding  the  distribution  of 
health  service  costs  and,  to  a  limited  extent,  bias  that  is  inherent  in  specific  risk  adjustment  methods 
are  each  isolated  in  the  following  analysis. 

In  the  process  of  examining  reinsurance  and  risk  adjustment,  this  research  and  the  following 
analysis  also  explore  the  nature  of  a  more  fundamental  relationship  between  individual  and  group- 
level  measures  used  to  assess  the  performance  of  risk  adjustment  methods.  The  principal 
individual-level  measure  of  model  performance,  the  R2  associated  with  the  generation  of  cost 
expectations,  has  been  proposed  as  the  most  appropriate  criterion  for  assessing  risk  adjustment 
methods  under  the  assumption  that  health  plans  operationalize  risk-selection  behavior — that  is, 
they  selectively  encourage  and  discourage  good  and  bad-risk  patients — based  on  individual 
differences  (Newhouse  1994).  Other  researchers  have  suggested  that  focusing  on  group-level 
measures  of  predictive  accuracy  is  a  more  appropriate  way  to  assess  risk-adjustment  methods,  in 
part  because  they  reflect  how  those  methods  are  actually  applied  (Hornbrook  and  Goodman  1994, 
Rossiter  et  al.  1994,  Welch  1985). 
Methods and standards for making group-level assessments in the application of risk adjustment
methods  are  not  well  established — in  practice  or  in  the  literature.  The  following  analysis  introduces 
a  technique  for  assessing  the  application  of  risk  adjustment  methods  at  the  group  level  within  the 
context  of  data  truncation  (associated  with  stoploss  reinsurance  in  this  study)  and  the  relative  size 
of  the  groups.  This  technique  draws  on  basic  statistical  principles  regarding  variance,  sample  size, 
and  the  estimation  of  means  to  identify  the  existence  of  bias  in  risk  adjusted  expectations.  It  also 
provides  a  platform  for  further  investigation  into  the  sources  of  that  bias. 

In  order  to  draw  a  clear  distinction  between  a  perspective  based  on  individual-level  measures  and 
one  based  on  group-level  measures,  the  following  analysis  treats  the  assessment  of  risk  adjustment 
methods  from  those  perspectives  independently.  The  first  two  sets  of  hypotheses  were  written  from 
the  perspective  of  individual-level  analyses.  The  third  set  of  hypotheses  was  written  from  a  group- 
level  perspective.  One  consequence  of  adhering  to  this  structure  is  that  a  hypothesis  written  from 
the  individual-level  perspective  regarding  the  performance  of  log  models  (hypothesis  2b)  is  rejected 
in  analysis  from  that  perspective.  That  rejection  is  subsequently  identified  as  inappropriate  when 
analyzed  from  the  perspective  of  group-level  measures.  This  approach  is  used  because  of  the 
primacy  of  individual-level  measures  in  the  literature  related  to  the  assessment  of  risk  adjustment 
methods. 

The  author  acknowledges  that  the  results  of  this  study  raise  more  questions  than  they  answer.  The 
analysis  that  follows  introduces  some  issues  that  are  necessary  or  useful  to  acknowledge  but  that 
are  not  possible  to  examine  more  fully  in  the  more  narrow  context  defined  for  this  analysis.  For 
example,  the  performance  of  ACGs  derived  using  actual  (untransformed)  dollars  is  shown  to  be 
more  noticeably  sensitive  to  changes  in  data  truncation  levels,  relative  to  other  methods,  based  on 
individual-level  measures  of  predictive  accuracy.  By  the  end  of  the  analysis,  it  is  reasonably  clear 
that  the  individual-level  results  reveal  information  about  bias  in  the  underlying  expectations  rather 
than  a  characteristic  of  the  ACG  system.  That  bias  was  evident  because  ACGs  and  ADGs  are  so 
closely  related.  A  full  treatment  of  how  bias  might  be  revealed  in  individual-level  measures  is 
beyond the scope of the following analysis. The limitations of this study, and the implications for
further research embodied in its results, will be discussed in the final chapter of this report.

An  analysis  of  the  initial  regression  calculations  and  a  detailed  comparison  of  expected  to  actual 
costs  are  presented  in  this  chapter.  With  some  important  exceptions,  the  results  reported  here  were 
very  similar  across  the  two  study  plans.  Thus,  the  following  discussion,  and  attendant  examples, 
will focus primarily on one of those plans, HMO-A. A complete set of summary results for each of
the  plans  is  included  in  Appendix  B. 
INITIAL INDIVIDUAL-LEVEL ANALYSIS

Hypothesis 1a: The relative performance of alternative risk-adjustment methods will be
consistent across measures, and across comparable study sites.

Hypothesis 1b: Individual-level measures of model performance associated with each
of the risk-adjustment methods used in this study will improve with the application of
successively lower truncation levels associated with stoploss reinsurance thresholds.

Hypothesis 1c: A lower reinsurance level, alone, may remove some of the relative
difference  between  risk  adjustment  methods  evident  in  measures  of  model  performance. 

The  first  set  of  hypotheses  posed  in  Chapter  3  was  intended  to  establish  a  baseline  relationship 
across  risk  adjustment  methods  and  stoploss  levels.  Some  relevant  underlying  assumptions  were 
that  the  relative  performance  of  risk  adjustment  methods  will  be  similar  to  the  application  of  the 
same  type  of  methods  in  other  studies,  that  the  process  of  truncating  individual  cases  of  extreme 
costs  at  increasingly  lower  levels  will  improve  performance,  and  that  lower  levels  of  truncation  may 
offset the relative performance of some alternative methods.

The  analysis  in  this  section  shows  that  alternative  risk  adjustment  methods  perform  in  a  predictable 
pattern — relative  to  each  other — across  comparable  populations  (health  plans),  and  that 
increasingly  lower  truncation  levels  improve  individual-level  measures  of  model  performance. 
However,  it  also  suggests  that  increasingly  lower  truncation  levels  exaggerate,  rather  than 
moderate,  differences  between  alternative  risk  adjustment  methods. 

Variation  Explained 

Table 5.1 presents a series of initial individual-level measures for each of the risk adjustment
models,  for  each  of  the  stoploss  levels.  The  table  reflects  the  regression  models  using  actual  (not 
transformed)  dollars  associated  with  service  costs  in  HMO-A.  Measures  of  explained  variation 
(based  on  the  adjusted  R2s)  are  presented  for  both  the  estimation  and  validation  samples.  Measures 
of  predictive  accuracy  shown  in  the  table  were  derived  from  the  validation  population.  Section  (a) 
of  the  table  shows  that  the  risk  adjustment  methods  do,  indeed,  perform  much  as  expected  given 
previous research based on non-Medicare populations. The simple demographic model produced
the  lowest  adjusted  R2  values,  explaining  3.4  to  4.6  percent  of  the  variation  in  costs,  across  stoploss 
levels,  for  the  estimation  sample.  Adding  a  flag  for  the  presence  of  a  chronic  condition  to  the 
demographic models explained an additional 1.4 to 2.6 percent of that variation. ACGs improved
on the chronic flag models further, on the order of 1 to 4 percentage points, depending on the
level  of  truncation  associated  with  stoploss.  Models  based  on  ADGs  performed  the  best  with 
improvements  of  2  to  3  percentage  points  relative  to  the  ACG  model. 
This pattern of relative improvement across the four basic risk adjustment methods is generally
constant  throughout  these  individual-level  results.  The  same  can  be  said  about  gradual 
improvements  in  the  adjusted  R2  values  associated  with  each  model  at  increasingly  lower  truncation 
levels.  However,  the  absolute  difference  in  R2s  between  each  basic  risk  adjustment  method  tends  to 
increase  slightly  at  successively  lower  stoploss  threshold  levels.  This  seems  to  be  the  result  of  both 
differences  in  baseline  R2  values  (at  the  $50,000  stoploss  level)  and  the  rate  of  change  across 
levels.  R2  values  for  the  CHR(1)  and  ACG  models  are  .014  apart  with  a  $50,000  threshold  in 
HMO-A,  for  example.  That  difference  is  .019  at  the  $25,000  level,  .036  at  the  next  lower  level, 
and  .046  at  the  lowest  threshold  level.  While  the  rate  of  change  in  R2  from  one  level  to  the  next  is 
roughly  equivalent  across  methods,  ACGs  had  consistently  higher  rates  than  the  other  methods 
with  approximately  28  percent  increases  across  the  highest  three  truncation  levels,  and  17  percent 
between  the  $10,000  and  $5,000  levels.  There  was  still  a  greater  absolute  difference  between 
ACGs and ADGs at the lowest stoploss level. The implication of this finding is that increasingly
lower stoploss levels reveal more difference between alternative risk adjustment
methods, rather than less as proposed in hypothesis 1c.

Models  based  on  a  continuous  treatment  of  age  consistently  outperformed  those  that  included  age 
categories,  although  by  a  very  small  amount  in  each  case.  By  minor  contrast  to  the  increasing 
absolute  difference  across  the  four  basic  risk  adjustment  methods  at  lower  stoploss  levels,  the 
difference  in  R2  values  associated  with  categorical  versus  continuous  treatments  of  age  for  any 
given  method  tends  to  decrease  with  lower  threshold  levels  because  of  the  relatively  higher  rate  of 
change across those levels for categorical models. It is fair to note that the age categories defined
for  this  study  were  those  suggested  by  actuaries  in  the  parent  company  of  the  two  plans.  Some 
alternative  categorization  might,  potentially,  make  up  for  some  of  the  differences  reflected  here. 
Nevertheless, a continuous treatment of age produces predictably better results than a categorical
treatment, as measured by the individual-level adjusted R2 associated with each treatment.

Measures  reported  in  Appendix  B  show  that  HMO-B  produced  much  the  same  pattern  of  results  as 
those  just  described  for  HMO-A.  R2  values  increased  at  the  same  general  rate  from  demographic, 
to  chronic  flag,  to  ACG,  and  to  ADG  models.  Those  values  increased  with  lower  stoploss 
thresholds,  and  differences  between  methods  were  greater  at  successively  lower  levels.  ACGs  had 
the  highest  rate  of  change,  again  going  up  approximately  28  percent  across  the  highest  stoploss 
levels.  R2  values  for  the  demographic  and  chronic  flag  models  were  slightly  higher  at  higher 
threshold  levels  in  HMO-A.  Values  for  ACGs  were  essentially  the  same  across  the  two  plans. 
Results  for  the  ADG  models  were  lower  in  HMO-A.  The  R2s  for  all  but  the  categorical  version  of 
the  demographic  model  were  higher  at  the  $5,000  truncation  level  in  HMO-B.  As  a  body,  these 
findings  suggest  that  alternative  risk  adjustment  methods  perform  consistently  when  they  are 
applied in comparable organizations.
Table 5.1: Individual-Level Measures by Stoploss Level, Using Untransformed Dollars (HMO-A)

Stoploss   A_G(1)   A_G(2)   CHR(1)   CHR(2)   ACG      ADG(1)   ADG(2)

(a) Adjusted R-Square, Estimation Half
50k        0.034    0.027    0.048    0.043    0.062    0.084    0.080
25k        0.041    0.034    0.058    0.053    0.079    0.104    0.100
10k        0.044    0.039    0.066    0.064    0.102    0.127    0.124
5k         0.046    0.043    0.072    0.071    0.118    0.145    0.143

(b) Adjusted R-Square as % of Max R-Square, Estimation Half
50k        11.7%    9.3%     16.6%    14.8%    21.4%    29.0%    27.6%
25k        14.1%    11.7%    20.0%    18.3%    27.2%    35.9%    34.5%
10k        15.2%    13.4%    22.8%    22.1%    35.2%    43.8%    42.8%
5k         15.9%    14.8%    24.8%    24.5%    40.7%    50.0%    49.3%

(c) Adjusted R-Square, Validation Half
50k        0.029    0.024    0.040    0.037    0.042    0.065    0.063
25k        0.037    0.032    0.051    0.048    0.061    0.085    0.083
10k        0.045    0.040    0.065    0.063    0.086    0.111    0.110
5k         0.047    0.043    0.071    0.070    0.097    0.126    0.124

(d) Mean Absolute Error, Validation Half
50k        96.09    96.24    95.34    95.36    93.20    91.90    91.97
25k        89.76    90.02    88.98    89.11    86.91    85.52    85.67
10k        78.44    78.77    77.60    77.80    75.62    74.13    74.31
5k         65.41    65.70    64.54    64.68    62.63    61.27    61.39

(e) Percent of Absolute Error Within $25, Validation Half
50k        22.9%    22.7%    29.4%    28.0%    26.0%    37.6%    38.1%
25k        23.5%    23.5%    29.8%    28.2%    27.7%    38.2%    38.1%
10k        25.1%    25.0%    29.0%    28.2%    32.4%    38.1%    37.9%
5k         28.3%    28.2%    31.9%    30.8%    41.7%    40.8%    41.5%

(f) Percent of Absolute Error Within $50, Validation Half
50k        54.1%    45.6%    54.1%    54.3%    60.3%    57.6%    56.6%
25k        54.7%    48.5%    54.4%    54.3%    61.4%    58.2%    57.2%
10k        56.1%    52.4%    57.2%    56.2%    63.1%    60.9%    60.3%
5k         63.3%    61.5%    63.2%    62.3%    67.5%    66.4%    66.7%

(g) Percent of Absolute Error More Than $400, Validation Half
50k        3.2%     3.2%     3.2%     3.1%     2.9%     2.8%     2.7%
25k        3.3%     3.2%     3.2%     3.1%     2.9%     2.8%     2.8%
10k        3.4%     3.3%     3.3%     3.3%     3.1%     2.9%     2.9%
5k         1.1%     1.1%     1.0%     0.9%     1.1%     0.9%     1.0%

The relationship between the R2 values and a theoretical maximum R2 (Max R2) is reflected in
Table 5.1 section (b). It is included in this table largely as a reference to the fact that only a limited
amount  of  the  underlying  variation  in  health  service  costs  is  likely  to  be  predictable  on  a  population 
basis  in  any  case.  The  Max  R2  is  also  commonly  discussed  and  reported  in  related  literature.  The 
maximum  values  used  in  this  table  are  those  reported  earlier,  in  Chapter  4,  and  fall  just  under  .3  (or 
30  percent)  for  each  truncation  level.  The  best  demographic  model  accounts  for  nearly  12  to  16 
percent of the theoretical maximum explainable variation in costs for HMO-A. The best of the
ADG  models  accounts  for  30  to  50  percent  of  that  variation  across  truncation  levels.  As  a  practical 
matter,  the  more  useful  contribution  of  including  this  measure  is  that  it  provides  an  alternative  scale 
to  see  the  differences  in  relative  improvement  across  risk  adjustment  methods  and  truncation  levels. 
The  amount  of  variation  explained  by  the  simplest  models  improves  by  a  third,  while  results  for  the 
ADG  models  improve  by  two  thirds  from  highest  to  lowest  truncation  level. 
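
The rescaling in section (b) is a simple ratio, sketched below. The Max R2 of 0.29 is an illustrative assumption consistent with "just under .3". The maxima actually used are those reported in Chapter 4 and may differ by truncation level. The function name is hypothetical.

```python
def pct_of_max_r2(adjusted_r2, max_r2):
    """Express an adjusted R2 as a percentage of a theoretical maximum R2."""
    return 100.0 * adjusted_r2 / max_r2

ASSUMED_MAX_R2 = 0.29  # illustrative only; see Chapter 4 for the values used

# Estimation-half R2 values from section (a) of Table 5.1.
demographic_50k = pct_of_max_r2(0.034, ASSUMED_MAX_R2)  # A_G(1), $50,000 level
adg_5k = pct_of_max_r2(0.145, ASSUMED_MAX_R2)           # ADG(1), $5,000 level
```

Under that assumed maximum, the two ratios round to the corresponding entries in section (b) of the table.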

Adjusted R2 values derived from the validation sample, and reported in section (c) of Table 5.1,
were essentially the same as or slightly lower than those for the estimation sample in HMO-A. There
appears  to  be  a  more  precipitous  drop  in  those  measures  for  the  ACG  and  ADG  models. 
Differences  between  the  chronic  flag  and  ACG  models  seem  to  flatten  out  at  the  highest  truncation 
level  ($50,000).  The  more  typical  relationship  across  the  four  basic  methods  was  restored  at  the 
$25,000  level.  Again,  HMO-B  exhibited  nearly  the  same  pattern  of  differences  between  estimation 
and  validation  sample  results,  although  a  few  values  are  higher  for  the  validation  sample  in  that 
plan  (see  Appendix  B).  One  notable  exception  to  the  typical  pattern  for  validation  (versus 
estimation)  sample  results  is  that  the  R2  values  for  the  ACG  model  are  higher  for  the  validation 
sample  in  HMO-B  at  each  stoploss  level.  The  direction  of  relative  differences  across  the  four  basic 
risk  adjustment  methods  remained  the  same.  Once  again,  these  findings  suggest  that  alternative 
risk  adjustment  methods  perform  consistently,  in  terms  of  variation  explained  at  the  individual 
level,  across  stoploss  levels  and  across  comparable  health  plans. 

Predictive  Accuracy 

Results shown in sections (d) through (g) of Table 5.1 reflect individual-level measures of
predictive accuracy based on the validation sample for HMO-A. The mean absolute error between
predicted and actual values is shown in section (d). Those means are similar across risk
adjustment methods for any given truncation level, although the direction of the differences that do
exist is consistent with the respective R2 results. There is a more pronounced effect on the reduction of
error by truncation level than there is across adjustment methods. To put this in some perspective,
the  difference  in  mean  absolute  error  between  the  simple  demographic  and  ADG  models  is  on  the 
order  of  $4  to  $5  per  person  at  any  truncation  level.  The  absolute  error  is  reduced  by 
approximately  a  third  ($30  per-member-per-month)  from  the  highest  to  the  lowest  truncation  level, 
regardless  of  risk  adjustment  method.  Thus,  the  choice  of  truncation  level  has  a  much  more 
pronounced  effect  on  the  reduction  of  overall  error  that  is  subsequently  subject  to  risk  adjustment 
than does the choice of risk adjustment methods used to distribute the remaining error. This,
perhaps  obvious,  finding  serves  to  illustrate  the  point  that  risk  limitation  removes  costs  from  the 
process  of  risk  adjustment.  It  does  not,  in  itself,  contribute  to  the  process  of  risk  adjustment.  Risk 
limitation  and  risk  adjustment  are  independent  elements  of  a  larger  process  used  to  control  financial 
risk. 

Sections  (e)  and  (f)  of  the  table  suggest  the  extent  to  which  the  absolute  error  associated  with 
individual  cases  centers  around  perfect  prediction,  or  no  error  between  predicted  and  actual  costs. 
The  first  of  these  measures,  the  percent  of  absolute  error  within  $25,  follows  the  same  general 
pattern as the R2 results in section (c) at the $50,000 stoploss level. Percentages increase from the
demographic,  to  the  chronic  flag,  to  the  ADG  models.  Differences  between  the  chronic  flag  and 
ACG  models  flatten  out.  The  percentages  remain  much  the  same  across  truncation  levels,  except 
those  for  the  ACG  model.  The  jump  in  percentages  for  the  ACG  model  between  a  $25,000  and 
$10,000  stoploss  level  restores  the  "usual  pattern"  of  results  across  all  methods.  Generally,  ACGs 
seem  more  sensitive  to  truncation  level  on  this  measure,  particularly  at  lower  stoploss  levels.  The 
relative improvement with lower truncation levels for ACGs is also evident in the results for HMO-B.
These patterns largely reinforce the hypothesis that risk adjustment methods perform
consistently  across  alternative  measures  of  performance  and  across  comparable  health  plans. 

The  percent  of  absolute  error  within  $50  reflects  a  broader  band  of  error  than  the  previous  measure. 
Thus,  the  associated  percentages  are  predictably  higher.  The  direction  of  differences  follows  the 
usual  pattern  between  the  demographic  or  chronic  flag  models  and  the  ADG  models.  However, 
ACGs  have  a  higher  percentage  on  this  measure  than  any  of  the  other  models  at  each  truncation 
level.  Although  the  differences  are  admittedly  slight,  and  especially  as  compared  to  ADGs,  ACGs 
seem  to  limit  the  spectrum  of  error  a  little  more  narrowly,  or  at  least  differently,  than  other 
methods.  The  results  for  HMO-B  revealed  a  clearer,  and  more  typical,  distinction  between  the 
demographic  and  chronic  flag  models.  Otherwise,  the  pattern  of  results  was  the  same  on  this 
measure  across  the  two  plans,  including  those  for  ACGs. 

The  percent  of  absolute  error  more  than  $400  reflects  the  extreme  tail  of  the  distribution  of 
prediction error. There is more consistency with the usual pattern of results across risk adjustment
methods on this measure than was the case for the previous two measures. This measure goes up
slightly  with  lower  stoploss  levels  (above  $5,000),  presumably  because  the  actual  values  for 
individuals  decrease  more,  on  average,  than  the  expected  values  as  the  threshold  level  for  truncation 
goes  down.  Since  this  measure  essentially  involves  cases  with  losses  in  excess  of  $4,800  per  year 
($400  *  12  months),  there  is  a  noticeable  drop  in  percentages  at  the  $5,000  stoploss  level.  If  the 
reinsurance  coverage  did  not  include  coinsurance  above  the  stoploss,  this  measure  would  be  very 
close  to  0.0  percent  for  any  model  at  that  level.  The  better  performance  of  ACGs  with  respect  to 
error  within  $50  disappears  at  this  extreme  end  of  the  continuum  of  absolute  error. 
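
The interaction between the $400 monthly band and the $5,000 stoploss can be illustrated with a small sketch of truncation. The coinsurance parameter is hypothetical, since the plan's actual share of costs above the threshold is not restated here, and the function names are illustrative.

```python
def retained_annual_cost(annual_cost, stoploss, coinsurance=0.0):
    """Annual cost left with the plan after stoploss truncation.

    coinsurance is the plan's assumed share of costs above the threshold
    (a hypothetical parameter; 0.0 means pure truncation at the stoploss).
    """
    excess = max(annual_cost - stoploss, 0.0)
    return min(annual_cost, stoploss) + coinsurance * excess

def pmpm(annual_cost, months=12):
    """Convert an annual cost to a per-member-per-month amount."""
    return annual_cost / months

# A hypothetical $50,000 case under a $5,000 stoploss.
pure = pmpm(retained_annual_cost(50000.0, 5000.0))               # no coinsurance
shared = pmpm(retained_annual_cost(50000.0, 5000.0, coinsurance=0.1))
```

With pure truncation the largest possible actual value is about $417 per month, so an absolute error above $400 requires an expected value below roughly $17. Any coinsurance above the threshold raises the retained amount and leaves some cases in that band.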
The findings based on individual-level measures of predictive accuracy generally suggest, again,
that  risk  adjustment  methods  perform  consistently  across  comparable  health  plans  and  across 
stoploss  levels.  There  is  some  inconsistency  evident  across  risk  adjustment  methods  in  these 
results,  particularly  with  respect  to  the  sensitivity  of  ACGs  to  truncation  level  and  ADGs  applied  to 
the  validation  sample  in  HMO-A,  but  the  overall  pattern  of  the  findings  reinforce  the  prior 
implications  drawn  from  measures  of  variation  explained  (the  adjusted  R2s)  at  the  individual  level. 

Other  Criteria 

There  are  considerable  counterbalancing  effects  associated  with  any  improvements  in  predictive 
accuracy  that  are  the  result  of  applying  increasingly  lower  stoploss  levels.  While  this  study 
primarily  focuses  on  direct  measures  of  model  performance  related  to  the  application  of  reinsurance 
and  risk  adjustment  methods,  it  is  important  to  consider  such  effects  in  an  overall  assessment  of  the 
study's  more  central  results.  This  is,  perhaps,  most  clearly  the  case  with  respect  to  the 
administrative  feasibility  of  choosing  between  stoploss  levels.  As  discussed  in  Chapter  2,  the 
administrative  burden  associated  with  stoploss  coverage  is  related  to  the  number  of  cases  that  meet 
the  reinsurance  threshold.  As  more  cases  are  involved,  more  work  is  required  of  the  reinsuring 
entity  to  adjudicate  claims.  There  is  more  intrusion  into  the  operation  of  the  reinsured  plan  on  the 
part  of  the  reinsurer  to  effect  that  adjudication.  There  are  also  fewer  dollars  remaining  that  are 
subject  to  risk  adjustment,  thus  removing  some  level  of  incentive  to  control  costs  while  providing 
care. 

Table  5.2  reflects  several  measures  related  to  the  number  of  cases,  and  the  dollars  associated  with 
those  cases,  truncated  at  each  stoploss  level.  Both  HMO-A  and  HMO-B  are  included.  This  table  is 
presented  to  provide  some  sense  of  the  scale  of  differences  involved  at  each  stoploss  level.  Row  (e) 
of  the  table,  for  example,  can  be  associated  with  the  level  of  dollars  that  would  be  recovered  at  each 
stoploss  level  and,  by  further  inference,  the  cost  of  coverage.  In  actual  practice,  the  measures  in 
this  table  are  likely  to  vary  by  plan,  as  they  do  in  this  table,  and  by  type  of  plan.  The  cost  of  the 
coverage  would  also  reflect  other  factors,  such  as  the  reinsuring  entity's  full  book-of-business  and 
its willingness to assume various levels of risk. Nevertheless, Table 5.2 shows that going from a
$50,000 stoploss threshold to one at $10,000 involves an increase in cases covered on the order of
ten-fold. Recoverable costs, in row (e), go up at roughly half that rate, and the administrative
costs of adjudicating claims would go up substantially at the lower level. Going to a stoploss level
of $5,000 roughly triples the number of cases involved again. Recoverable costs go up at a
slower rate than the increase in the number of cases at each successively lower stoploss level.

Gameability is also an issue in the context of reinsurance. A stoploss threshold may serve as a
target above which providers begin to lose interest in, or at least the incentive for, providing care
efficiently. While coinsurance is intended to help retain some incentive above the stoploss
threshold, the measures in Table 5.2 suggest that just under a third of total service costs may fall
outside the direct concern of providers at a $5,000 stoploss level. In the absence of the 10 percent
coinsurance included in this table, recoverable costs would increase a little more than 10 percent
across stoploss levels, and assumable per-member-per-month costs, in row (b), would go down
slightly. Any incentive to providers embodied in the copayment would also be removed.

Table 5.2: Numbers and Costs of Truncated Cases

                             HMO-A ($81 untruncated)             HMO-B ($93 untruncated)
Stoploss level               $50k     $25k     $10k     $5k      $50k     $25k     $10k     $5k

(a) N                        22,335   22,335   22,335   22,335   17,689   17,689   17,689   17,689
(b) $PMPM                    $78      $74      $67      $59      $91      $86      $76      $64
    (b) as % of untruncated  95.5%    91.1%    82.5%    72.0%    96.8%    91.8%    81.5%    68.6%
(c) # truncated              25       72       275      947      24       77       317      963
(d) $ per case               $3,294   $2,248   $1,153   $535     $2,236   $1,752   $964     $540
(e) $ per N                  $4       $7       $14      $23      $3       $8       $17      $29
    (e) as % of untruncated  4.5%     8.9%     17.5%    28.0%    3.2%     8.1%     18.5%    31.4%
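
The arithmetic behind measures of this kind can be sketched as follows. This is a minimal illustration rather than the study's actual processing: the function names, the five made-up annual costs, and the assumption that every member is enrolled for a full 12 months are hypothetical. The sketch assumes that the plan retains all costs up to the stoploss threshold plus a 10 percent coinsurance share of any excess, with the reinsurer paying the remainder.

```python
def retained_cost(annual_cost, stoploss, coinsurance=0.10):
    """Cost left with the plan after stoploss reinsurance.

    Assumption: the plan keeps all costs up to the threshold plus a
    coinsurance share of any excess; the reinsurer pays the rest.
    """
    excess = max(annual_cost - stoploss, 0.0)
    return min(annual_cost, stoploss) + coinsurance * excess


def stoploss_summary(annual_costs, stoploss, coinsurance=0.10):
    """Summarize truncated cases and dollars for one stoploss level."""
    n = len(annual_costs)
    total = sum(annual_costs)
    retained = sum(retained_cost(c, stoploss, coinsurance) for c in annual_costs)
    recovered = total - retained
    return {
        "n_truncated": sum(1 for c in annual_costs if c > stoploss),
        "retained_pmpm": retained / n / 12,    # assumes 12 member-months each
        "recovered_pmpm": recovered / n / 12,
        "recovered_share": recovered / total,
    }


# Hypothetical annual costs for five members (illustrative values only).
costs = [0.0, 600.0, 2400.0, 8000.0, 60000.0]
with_coins = stoploss_summary(costs, stoploss=5000.0, coinsurance=0.10)
no_coins = stoploss_summary(costs, stoploss=5000.0, coinsurance=0.0)

# Ratio of recoverable dollars without and with the coinsurance.
ratio = no_coins["recovered_pmpm"] / with_coins["recovered_pmpm"]
print(with_coins["n_truncated"], round(ratio, 3))
```

Under these assumptions, removing the coinsurance raises recoverable dollars by a factor of 1/0.9, or about 11 percent, whatever the threshold. That is consistent with the increase of a little more than 10 percent noted above.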

Hypotheses 

With  respect  to  the  first  set  of  hypotheses  posed  for  this  study,  the  analysis,  so  far,  suggests  that 
there  is  a  generally  predictable,  or  "usual",  pattern  of  relative  performance  across  the  risk 
adjustment  methods  examined  here.  There  is  a  consistent  pattern  of  improvement  going  from 
models  based  on  simple  demographics,  to  a  chronic  flag,  to  ACGs,  and  to  ADGs.  That  pattern  is 
largely  evident  across  individual-level  measures,  and  across  study  sites.  There  is,  accordingly,  no 
reason to reject hypothesis 1a.

Measures  of  model  performance  generally  improve  with  increasingly  lower  stoploss  thresholds. 
One  exception  is  that  the  absolute  error  associated  with  extreme  cases  (more  than  $400)  is  largely 
constant,  or  declines,  across  relevant  stoploss  levels.  However,  that  is  most  likely  an  artifact  of  the 
measure  itself  in  that  the  same  high-cost  cases  are  included  in  its  calculation  at  all  stoploss  levels. 
Thus, there is no preponderance of evidence to reject hypothesis 1b.

There is some evidence to reject hypothesis 1c. The relative difference between alternative methods
increases (measured in terms of adjusted R2), rather than decreases, at increasingly lower stoploss
levels.  Moreover,  the  relative  performance  of  ACGs  seems  to  be  more  sensitive  to  truncation  level 
in  terms  of  results  based  on  the  validation  samples,  although  the  extent  of  any  differences  is  not 
clear.  Higher  stoploss  levels  obscure  differences  between  the  chronic  flag  and  ACG  models  in  a 
narrow band of predictive error ($25), and ACGs generally show more marked improvement in
measures of predictive accuracy across stoploss levels than do the other methods. Since ADGs
seemed to perform unusually poorly when they were applied to the validation sample in HMO-A, it is
not  clear  whether  the  general  inconsistencies  in  these  results  are  attributable  to  changes  in  stoploss 
level,  the  risk  adjustment  methods,  or  some  other  source.  Nevertheless,  these  findings  seem  to 
suggest  that  increasingly  lower  truncation  levels  tend  to  exaggerate,  rather  than  moderate, 
differences  in  performance  between  risk  adjustment  alternatives,  as  measured  at  the  individual 
level.  The  implication  of  this  finding  is  that  removing  variability  related  to  high-end  costs  makes  a 
choice  of  risk  adjustment  methods  more,  rather  than  less,  distinct.  Therefore,  the  choice  of  risk 
adjustment  methods  is  more,  rather  than  less,  important — with  respect  to  those  dollars  that  are 
subject  to  risk  adjustment — as  the  stoploss  level  is  lower. 

ADDITIONAL  DATA  TREATMENTS 

Hypothesis  2a:  Power  calculations  derived  from  Box-Cox  methodology  will  indicate 
that  transformation  of  the  dependent  measure  (total  service  costs)  is  appropriate. 

Hypothesis  2b:  Increasingly  complex  treatment  of  the  dependent  measure  will  increase 
the  R2  of  each  risk-adjustment  model  at  each  level  of  reinsurance. 

The  second  set  of  hypotheses  outlined  in  Chapter  3  was  proposed  to  examine  the  effects  of  stoploss 
levels  in  the  context  of  other  data  treatments  typically  applied  when  health  service  cost  data  are 
used  to  establish  expected  cost  values.  As  discussed  in  an  earlier  chapter,  the  singular  classic 
discussion  of  the  general  issue  of  modeling  health  cost  expectations  was  presented  by  Duan  et  al.  in 
1982. That study suggested that the distribution of health service costs was such that the log of
those  data  would  provide  better  expectations  than  the  same  data  on  their  original  scale  (actual 
dollars).  In  order  to  deal  with  the  fact  that  there  is  no  log  of  0  (zero),  Duan  and  his  colleagues 
suggested  that  parameter  estimates  for  risk  factors  should  be  derived,  first,  from  those  who  actually 
used  services.  Those  estimates  should  then  be  adjusted  for  the  probability  of  service  use.  In 
addition,  they  proposed  treating  individuals  who  have  any  inpatient  expenses  independently  of 
those  who  do  not,  to  account  for  the  differences  in  distribution  of  inpatient  and  ambulatory  costs. 
Power  calculations  discussed  in  Chapter  4  indicated  that  log  transformation  was  appropriate  given 
the  distribution  of  total  service  costs  used  in  this  study. 
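
The structure of the multipart approach described above can be sketched in a few lines. This is an illustration of the general idea only: the function names, the probabilities, and the log-scale means are hypothetical, and the retransformation factor (`smear`) is left as a plain multiplier rather than estimated. The sketch assumes that expected cost is the probability of any use multiplied by the expected cost among users, with users optionally split by whether they have any inpatient expenses.

```python
import math


def two_part_expectation(p_use, mean_log_cost, smear=1.0):
    """Expected dollars = P(any use) * retransformed log-scale expectation."""
    return p_use * math.exp(mean_log_cost) * smear


def four_part_expectation(p_use, p_inpatient, mean_log_amb, mean_log_inp,
                          smear_amb=1.0, smear_inp=1.0):
    """Users are split into ambulatory-only and any-inpatient subgroups."""
    ambulatory = math.exp(mean_log_amb) * smear_amb
    inpatient = math.exp(mean_log_inp) * smear_inp
    among_users = (1.0 - p_inpatient) * ambulatory + p_inpatient * inpatient
    return p_use * among_users


# Hypothetical member: 80 percent chance of any use, 5 percent of users admitted.
two_part = two_part_expectation(0.8, math.log(500.0))
four_part = four_part_expectation(0.8, 0.05, math.log(400.0), math.log(6000.0))
print(round(two_part, 2), round(four_part, 2))
```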

The  analysis  in  this  section  indicates  that,  contrary  to  hypothesis  2b,  increasingly  complex 
treatments  of  the  dependent  variable  do  not  clearly  improve  model  performance  in  terms  of 
individual-level  measures.  Adjusted  R2  values  derived  from  models  using  the  best  of  the  multipart 
log  treatments  are  no  higher,  or  slightly  lower,  than  the  same  measures  derived  using  actual-dollar 
amounts. Measures of predictive accuracy are largely consistent with that assessment. There is
some indication, in terms of both variation explained and measures of predictive accuracy, that the
choice  of  data  treatment  affects  the  underlying  expectations  for  the  ACG  and  ADG  models.  It  is 
not  clear  how  significant  those  effects  are. 

Variation  Explained 

Cost expectations were derived for this study using one-, two-, and four-part log-based models
following  the  general  techniques  described  by  Duan  and  his  colleagues.  Table  5.3  shows  the 
adjusted  R2  values  associated  with  the  actual-dollar  model  discussed  earlier  and  each  of  the  log- 
model  applications,  based  on  the  estimation  sample  for  HMO-A.  The  results  for  the  actual-dollar 
and  the  four-part  log  treatment  based  on  the  validation  sample  are  also  shown.  Among  the  log 
models  in  sections  (b),  (c),  and  (d)  of  that  table,  results  improved  in  a  predictable  pattern  with  the 
complexity  of  the  data  treatment.  However,  the  actual-dollar  model  produced  generally  better 
results  than  any  of  the  log-model  alternatives.  The  four-part  log  version  produced  essentially  the 
same,  or  slightly  lower,  adjusted  R2  values  as  the  actual-dollar  model  across  risk  adjustment 
methods. 

It  is  important  to  note  that  the  R2  values  for  the  log  models  were  calculated  directly  using  the  initial 
untransformed  dollar  values  of  actual  costs  for  any  given  set  of  individuals  and  expectations 
transformed  to  the  original  dollar  scale.  R2  values  generated  during  the  regression  calculations  for 
the  log  models  (those  that  appear  in  the  computer-generated  software  output)  were  consistently 
higher  than  those  based  on  untransformed  dollars,  but  the  log-based  values  reflect  variation 
explained on the log scale. In the case of the two- and four-part models, the computer-generated R2
values  also  reflect  only  those  who  actually  used  services.  R2  values  for  log-derived  expectations 
that  have  been  transformed  to  the  original  dollar  scale  have  rarely,  if  ever,  been  reported  in  the 
literature  on  risk  adjustment.  Thus  the  values  reported  in  Table  5.3  may  seem  strange  to  those  who 
are  accustomed  to  seeing  such  values  based  on  the  log  scale  alone.  All  of  the  values  reported  in 
Table  5.3  were  calculated  directly  from  untransformed-actual  and  dollar-transformed-expected 
values  using  the  same  formulation.  As  a  check  on  the  validity  of  the  calculations,  R2  values  for  the 
actual-dollar  models  were  compared  to  those  generated  by  the  statistical  software  used  for  the 
regression  analyses.  The  software-generated  values  were  the  same  as  the  directly  calculated  values 
to  the  fifth  significant  digit.  Examples  of  log-scale  R2  values  can  be  seen  in  the  sample  computer 
software  output  included  in  Appendix  B. 
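
The distinction between log-scale and dollar-scale R2 values can be illustrated with a small simulation. This is a sketch under stated assumptions, not the calculation used in this study: the data are simulated, the single "risk" predictor is hypothetical, and the retransformation uses a Duan-style smearing factor (the mean of the exponentiated log-scale residuals), which may differ from the adjustment actually applied here.

```python
import math
import random


def r_squared(actual, expected):
    """1 - SSE/SST on whatever scale the inputs are in (can be negative)."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - e) ** 2 for a, e in zip(actual, expected))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, n_predictors):
    """Standard adjustment for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def fit_line(x, y):
    """Ordinary least squares with one predictor: y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


# Simulated service users: one hypothetical risk score, skewed dollar costs.
rng = random.Random(0)
risk = [rng.random() for _ in range(2000)]
dollars = [math.exp(5.0 + 1.5 * r + rng.gauss(0.0, 1.2)) for r in risk]
log_dollars = [math.log(d) for d in dollars]

a, b = fit_line(risk, log_dollars)
log_fit = [a + b * r for r in risk]
residuals = [y - f for y, f in zip(log_dollars, log_fit)]
smear = sum(math.exp(e) for e in residuals) / len(residuals)  # smearing factor
dollar_fit = [math.exp(f) * smear for f in log_fit]

r2_log_scale = r_squared(log_dollars, log_fit)    # what regression output reports
r2_dollar_scale = r_squared(dollars, dollar_fit)  # comparable to a dollar model
print(round(r2_log_scale, 3), round(r2_dollar_scale, 3))
```

The two values need not agree, and with skewed cost data the log-scale figure is often the higher of the two. Because the dollar-scale value is computed directly as one minus the ratio of error to total variation, it can fall below zero when retransformed expectations fit worse than the overall mean.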

While  R2  values  were  lower  for  the  four-part  treatment  as  opposed  to  using  actual  dollars,  section 
(f)  of  the  table  shows  that  there  was  a  less  substantial  drop  in  explained  variation  when  the 
expectations  were  applied  to  the  validation  sample,  than  was  apparent  using  actual  dollars.  One 
implication  of  this  finding  is  that  there  may  be  some  level  of  bias  in  the  underlying  expectations 
related  to  the  choice  of  data  treatment,  although  the  extent  of  any  bias  is  difficult  to  discern  in  these 
data. Comparable results shown in Appendix B show that HMO-B also exhibits the same, or
lower, R2 values using the four-part log treatment as opposed to using actual dollars. Except for
what may be anomalously low results for ADGs at higher stoploss levels in HMO-A, there seem to
be no differences across the actual-dollar and log models related to stoploss level in either plan.

Table 5.3: Adjusted R-square (HMO-A)

Stoploss
level      A_G(1)   A_G(2)   CHR(1)   CHR(2)   ACG      ADG(1)   ADG(2)

(a) One-part model using actual dollars, Estimation Half
50k        0.034    0.027    0.048    0.043    0.062    0.084    0.080
25k        0.041    0.034    0.058    0.053    0.079    0.104    0.100
10k        0.044    0.039    0.066    0.064    0.102    0.127    0.124
5k         0.046    0.043    0.072    0.071    0.118    0.145    0.143

(b) One-part model using log transformation, Estimation Half
50k        0.028    0.021    0.045    0.039    0.052    0.014    0.008
25k        0.034    0.027    0.056    0.049    0.067    0.016    0.009
10k        0.032    0.033    0.061    0.060    0.080   -0.015   -0.014
5k         0.028    0.036    0.062    0.067    0.084   -0.049   -0.039

(c) Two-part model using log transformation, Estimation Half
50k        0.029    0.023    0.045    0.040    0.057    0.059    0.054
25k        0.036    0.029    0.056    0.050    0.075    0.075    0.070
10k        0.036    0.035    0.063    0.062    0.094    0.079    0.079
5k         0.035    0.039    0.067    0.070    0.106    0.076    0.082

(d) Four-part model using log transformation, Estimation Half
50k        0.033    0.024    0.049    0.043    0.060    0.061    0.069
25k        0.040    0.031    0.060    0.054    0.078    0.081    0.090
10k        0.042    0.038    0.067    0.065    0.100    0.112    0.116
5k         0.043    0.042    0.072    0.072    0.116    0.131    0.135

(e) One-part model using actual dollars, Validation Half
50k        0.029    0.024    0.040    0.037    0.042    0.065    0.063
25k        0.037    0.032    0.051    0.048    0.061    0.085    0.083
10k        0.045    0.040    0.065    0.063    0.086    0.111    0.110
5k         0.047    0.043    0.071    0.070    0.097    0.126    0.124

(f) Four-part model using log transformation, Validation Half
50k        0.028    0.023    0.041    0.036    0.043    0.061    0.061
25k        0.036    0.030    0.053    0.049    0.062    0.079    0.078
10k        0.043    0.040    0.066    0.065    0.086    0.104    0.105
5k         0.046    0.044    0.072    0.071    0.097    0.118    0.119

Predictive  Accuracy 

Individual-level  measures  of  predictive  accuracy  drawn  from  the  log  models  that  are  not  presented 
here,  but  that  are  included  in  Appendix  B,  generally  suggest  the  same  results  as  those  based  on 
adjusted  R2  values.  The  pattern  of  mean  absolute  error  does  not  seem  to  be  affected  by  data 
treatment  model  type.  The  four-part  model  produces  very  nearly  the  same,  but  generally  poorer, 
results  than  the  actual-dollar  model  given  each  of  the  remaining  measures  of  predictive  accuracy 
discussed  so  far.  The  other  log  models  produce  even  poorer  results. 

The  better  performance  of  ACGs  over  ADGs  with  respect  to  the  percent  of  absolute  error  within 
$50  was  not  evident  using  the  four-part  log  treatment.  Thus,  the  "usual"  pattern  of  relationships 
across  risk  adjustment  methods  was  restored.  As  was  the  case  in  terms  of  explained  variation  with 
ADGs applied to the validation sample in HMO-A, this finding suggests that the choice of data
treatments  has  some  effect  on  the  bias  associated  with  underlying  expectations.  Once  again,  the 
extent  of  any  bias  is  not  readily  apparent  from  these  individual-level  results.  Also  as  was  the  case 
for  the  adjusted  R2  results,  there  were  no  pronounced  effects  related  to  stoploss  level  across  data 
treatment  model  type.  Similar  patterns  on  each  of  these  measures  are  evident  from  the  data  for 
HMO-B. 

Hypotheses 

As  noted  at  the  beginning  of  this  section  and  discussed  in  Chapter  4,  log  transformation  appeared 
to  be  appropriate  on  a  statistical  basis  for  the  data  in  this  study.  Thus,  there  was  no  evidence  to 
reject  hypothesis  2a.  There  was  some  limited  evidence  of  bias  in  the  underlying  expectations  that 
seems  to  be  related  to  the  choice  of  data  treatments.  However,  there  was  considerable  evidence  to 
suggest  that  data  transformation  and  multipart  modeling  do  not  substantially  improve  on  one-part 
models  using  actual-dollar  amounts  given  individual-level  measures  of  model  performance. 
Models  based  on  actual  dollars  produced  consistently  better  results  in  terms  of  adjusted  R2.  That 
evidence  was  largely  supported  by  individual-level  measures  of  predictive  accuracy.  Hypothesis  2b 
cannot be reasonably supported, given this evidence alone.

It  is  worth  noting  that  there  seems  to  be  an  emerging  trend  not  to  use  multipart  log  models  to 
establish  cost  expectations  in  more  recent  studies  of  risk  adjustment  methods,  such  as  those  from 
Hornbrook  and  Goodman  (1995)  and  Dunn  et  al.  (1995).  At  the  same  time,  very  recent  reviews  of 
risk  adjustment  methods  and  techniques,  such  as  Shwartz  and  Ash  (1994),  continue  to  acknowledge 
the  theoretical  potential  of  transforming  the  underlying  values  used  to  estimate  costs.  The  findings 
so far in this study are most interesting, perhaps, in that they provide formal evidence that is not
currently available in related literature for the trend away from the use of log transformation. One
possible  explanation  for  these  results  is  that  the  process  of  re-transforming  log  expectations  to  the 
original  dollar  scale,  and  smearing  some  portion  of  "lost"  dollars  across  those  expectations, 
obscures  any  contribution  to  the  reduction  of  bias  in  the  expectations  gained  by  deriving  them  on  a 
log  scale.  Alternatively,  however,  it  may  be  that  individual-level  results  do  not  adequately  reflect 
bias  that  is  more  generally  a  population-level  effect. 

GROUP-LEVEL  ANALYSIS 

Hypothesis  3a:  Group-level  measures  of  predictive  accuracy  will  improve  as  the 
maximum  value  associated  with  individual-level  stoploss  reinsurance  decreases. 

Hypothesis  3b:  Group-level  measures  of  predictive  accuracy  will  improve  with 
increasingly  complex  treatment  of  the  dependent  measure. 

The  third  set  of  hypotheses  proposed  in  Chapter  3  addressed  the  relationship  between  measures  of 
model  performance  based  on  individual-level  data  and  measures  based  on  assessment  at  the  group 
level.  There  is  a  general  assumption,  as  evidenced  in  the  prominence  of  the  R2  in  related  published 
analyses,  that  the  relative  performance  of  risk  adjustment  methods — as  measured  at  the  individual 
level — will  be  much  the  same  at  the  group  level.  This  assumption  is  supported,  in  part,  by  basic 
statistical  principles  that  state  that  the  mean  of  the  distribution  of  sample  means  is  the  same  as  the 
mean  of  the  underlying  population  of  individual  measures,  and  that  the  variance  of  the  sample  mean 
is  related  to  the  variation  in  the  underlying  measures  by  the  size  of  the  sample.  The  larger  the 
sample,  the  better  the  mean  of  that  sample  approximates  the  true  mean  of  the  underlying 
distribution.  It  is  also  important  to  remember  that  the  mean  of  a  sample  drawn  from  a  population 
of  measures  that  is  not  normally  distributed  will  reflect  bias  inherent  in  that  distribution. 
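
The relationship between group size and the variability of group means can be illustrated with a short simulation. This is a sketch only: the skewed "cost" population is simulated from a lognormal distribution with arbitrary parameters, the names are hypothetical, and the groups are drawn purely at random rather than by provider assignment as in this study.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical skewed "cost" population (lognormal); not the study data.
population = [rng.lognormvariate(4.0, 1.0) for _ in range(20_000)]
pop_sd = statistics.pstdev(population)


def sd_of_group_means(group_size, n_groups=200):
    """Standard deviation of the means of randomly drawn groups of one size."""
    means = [
        statistics.fmean(rng.choices(population, k=group_size))
        for _ in range(n_groups)
    ]
    return statistics.stdev(means)


# Compare the simulated spread of group means with sigma / sqrt(n).
results = {}
for size in (500, 1500, 3000, 5000):
    results[size] = (sd_of_group_means(size), pop_sd / size ** 0.5)
    print(size, round(results[size][0], 2), round(results[size][1], 2))
```

With purely random grouping, the simulated spread of group means should track the population standard deviation divided by the square root of the group size. Departures from that relationship in real data are the kind of evidence examined in this section.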

The  analysis  reported  here  is  intended  to  examine  the  scale  of  the  relationship  between  individual 
and  group-level  results.  The  assumption  underlying  hypothesis  3a  is  that  improvements  associated 
with  increasingly  lower  stoploss  levels  noted  at  the  individual  level  will  be  similarly  reflected  in 
group-level  results.  Hypothesis  3b  was  based  on  the  prior  assumption  underlying  hypothesis  2b 
that  multipart  log  models  would  fare  better,  relative  to  the  actual-dollar  model.  Although  that  prior 
hypothesis  was  not  supported  by  the  individual-level  results,  the  log-based  results  are  included  in 
this  analysis  to  help  shed  light  on  the  relationship  between  individual  and  group-level  measures. 

The  analysis  in  this  section  indicates  that  group-level  measures  reveal  bias  in  underlying 
expectations  related  to  the  choice  of  data  treatments  more  clearly  than  do  individual-level  measures. 
Moreover, group-level analysis can be used to estimate the sum of bias that is not addressed in the
application of risk adjustment as a whole, including bias that is not necessarily discernible using
individual-level measures. The pattern of group-level measures across risk adjustment methods
should also make it possible to estimate the extent of bias from other sources that is evident across
otherwise comparable populations.

Individual  plan  members  were  aggregated  into  groups  based  on  their  primary  care  provider  (PCP) 
assignment in each plan. Bootstrap sampling at the PCP level was used to create 60 groups at each
of four different relative sizes (500, 1,500, 3,000, and 5,000). Expected and actual cost values were
aggregated to the group level, and a predictive ratio was calculated such that values over 1.00
indicate that expectations were greater than actual costs.
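
One way to picture the grouping step is sketched below. The details are assumptions made for illustration, not a description of the procedure actually used: the sketch draws whole PCP panels with replacement until a target group size is reached, and the panel sizes, cost values, and function names are all hypothetical.

```python
import random


def bootstrap_group(panels, target_size, rng):
    """Draw PCP panels with replacement until the group has target_size members.

    `panels` maps a PCP id to that PCP's members as (expected, actual) cost
    pairs. Whole panels are added, so a group can exceed the target by up to
    one panel.
    """
    pcp_ids = list(panels)
    members = []
    while len(members) < target_size:
        members.extend(panels[rng.choice(pcp_ids)])
    return members


def predictive_ratio(members):
    """Aggregate expected over aggregate actual cost; above 1.00 means
    expectations were greater than actual costs."""
    expected = sum(e for e, _ in members)
    actual = sum(a for _, a in members)
    return expected / actual


rng = random.Random(7)
# Hypothetical plan: 40 PCPs, panels of 50 to 300 members, made-up costs.
panels = {
    pcp: [(80.0, rng.expovariate(1 / 80.0)) for _ in range(rng.randint(50, 300))]
    for pcp in range(40)
}
groups = [bootstrap_group(panels, 500, rng) for _ in range(60)]
ratios = [predictive_ratio(g) for g in groups]
print(len(ratios), round(min(ratios), 3), round(max(ratios), 3))
```

Sampling whole panels, rather than individual members, keeps whatever selection exists at the provider level inside each pseudo-group.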

Three  measures  of  predictive  accuracy  at  the  group  level  are  included  in  this  analysis.  The  mean 
forecasting  bias  is  examined  as  the  percentage  difference  between  predicted  and  actual  costs.  For 
any given group, that is the primary predictive ratio minus 1.00. A negative bias indicates that a
group  would  receive  less,  on  average,  than  its  related  actual  costs  if  the  expectations  were  used  as 
the  sole  basis  for  reimbursement.  Duan  et  al.  (1982)  described  this  as  a  measure  of  the  overall 
accuracy  of  a  given  risk  adjustment  model.  It  shows  how  far,  and  in  what  direction,  the  underlying 
expectations  fall  relative  to  actual  costs.  The  mean  squared  forecasting  error  is  the  mean  of  the 
square  of  the  forecasting  bias.  It  reflects  the  dispersion  of  error  around  mean  error.  Duan  and  his 
colleagues  described  this  measure  as  reflecting  the  adequacy  of  the  underlying  expectations  because 
measures  closer  to  0.0  (zero)  suggest  more  even  distribution  of  error,  without  respect  to  how 
accurate a method is. The percentage of groups within 5 percent of a perfect predictive ratio (1.00)
is  meant  to  suggest  the  effects  each  method  has  in  a  more  tangible  way  than  the  first  two  measures. 
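
The three measures just described can be written out as a short calculation. This is a minimal sketch following the definitions in the text. The function name, dictionary keys, and the four example predictive ratios are hypothetical.

```python
def group_measures(predictive_ratios, tolerance=0.05):
    """Group-level accuracy measures from a list of predictive ratios."""
    biases = [r - 1.0 for r in predictive_ratios]  # forecasting bias per group
    n = len(biases)
    return {
        "mean_forecasting_bias": sum(biases) / n,
        "mean_squared_forecasting_error": sum(b * b for b in biases) / n,
        "share_within_tolerance": sum(1 for b in biases if abs(b) <= tolerance) / n,
    }


# Hypothetical predictive ratios for four groups (illustrative values only).
ratios = [0.90, 0.96, 1.02, 1.08]
measures = group_measures(ratios)
print(measures)
```

Because the squared errors are taken around zero rather than around the mean error, the mean squared forecasting error computed this way equals the variance of the group-level errors plus the square of the mean forecasting bias.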

Within  the  context  of  the  more  general  issue  of  the  relationship  between  individual  and  group-level 
measures,  there  are  three  primary  contributing  factors  that  are  addressed  in  this  analysis.  One  is  the 
stoploss  level,  which  is  the  nominal  focus  for  this  study.  Another  is  the  choice  of  data  treatments 
underlying  the  expected  values.  The  third  is  the  relative  size  of  the  groups. 

Stoploss  Level 

The  analysis  of  individual-level  measures  earlier  in  this  chapter  indicated  that  there  was  a  generally 
predictable  pattern  of  results  across  risk  adjustment  methods,  that  those  results  improved  with 
successively  lower  stoploss  levels,  and  that  the  absolute  difference  across  methods  increased  at 
lower  levels.  HMO-B  exhibited  the  same  pattern  of  results  as  did  HMO-A.  As  discussed  more 
fully  at  the  end  of  Chapter  2,  basic  statistical  principles  suggest  that,  since  expectations  are 
essentially  sample  mean  values  from  some  larger  population  of  possible  values,  there  should  be 
some  identifiable  relationship  between  individual  and  group  level  results,  and  that  relationship 
should involve the size of the groups.

Table 5.4 shows each of the group-level measures of predictive accuracy for the actual-dollar
model, given groups of 3,000 plan members, for both HMO-A and HMO-B. Because these
measures are derived from the validation sample, they are most directly comparable to the other
results reported for those samples. Given the fact that expectations were drawn from the estimation
samples, these measures can also be meaningfully compared to findings for those samples.
Sections (a) through (c) of Table 5.4 are the companion results to the measures reported for
HMO-A in Table 5.1, above. Sections (d) through (f) can be compared to results reported for HMO-B in
Appendix B, although, again, the patterns of individual-level results were generally the same for
both plans.

HMO-A.  There  was  general  consistency  of  the  group-level  results  with  the  pattern  of  individual- 
level  results,  for  HMO-A.  The  percentage  of  groups  within  5  percent  of  actual,  in  particular, 
followed  the  "usual"  pattern  of  results  described  earlier.  The  percentage  of  groups  within  5  percent 
of  actual  costs  went  up  from  the  demographic,  to  the  chronic  flag,  to  the  ACG,  to  the  ADG  models, 
and it went up as the stoploss level went down. There were notable differences between the other
group-level  measures  and  the  earlier  results.  For  example,  mean  forecasting  bias  went  up  rather 
than  down  at  the  lowest  stoploss  level.  Mean  bias  was  smaller,  and  therefore  better,  for  ACGs  than 
for  ADGs,  despite  an  approximately  2  point  advantage  for  ADGs  given  the  R2  results.  The  mean 
squared  forecasting  error  was  essentially  the  same  for  both  of  these  methods.  There  is  no  formal 
statistical  measure  to  compare  differences  in  mean  squared  error,  although  a  lower  value  is 
generally  better.  Tests  were  conducted  to  assess  the  statistical  significance  of  results  for  HMO-A 
for  the  other  two  measures.  Tables  reflecting  the  patterns  of  significance  discussed  in  this  section 
are  included  in  Appendix  B.  Unless  otherwise  noted,  significance  is  reported  at  a  .01  level. 

Two-tail  t-tests  of  paired  differences  were  used  on  mean  forecasting  bias  reported  in  Table  5.4  to 
assess  whether  the  overall  differences  evident  in  that  measure  across  risk  adjustment  methods  were 
statistically  significant,  in  terms  of  the  underlying  difference  in  the  measures  for  any  particular 
group.  Two  specific  models  were  tested  at  any  one  time,  rather  than  testing  all  the  models  in  one 
test,  in  order  to  isolate  the  differences  between  models.  For  example,  the  mean  forecasting  bias 
associated  with  the  continuous  version  of  the  ADG  model  was  compared  to  the  same  measure  for 
the  categorical  ADG  model.  There  was  a  significant  difference  in  the  mean  forecasting  bias,  for 
any  given  method,  between  pairs  of  models  with  age  defined  as  a  continuous  versus  a  categorical 
variable  (at  the  .05  level  for  the  chronic  flag  models),  at  the  $50,000  stoploss  level.  That 
significance  disappears  for  the  chronic  flag  models  at  the  $25,000  stoploss  level,  and  for  the 
demographic  model  at  the  $10,000  level.  It  remained  at  all  levels  for  the  ADG  model  pairs. 
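
The form of the paired comparison can be sketched as follows. This is an illustration only: the function name and the five pairs of bias values are hypothetical, and the sketch computes the test statistic without looking up a p-value.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.fmean(diffs) / standard_error, n - 1


# Hypothetical forecasting bias for the same five groups under two models.
bias_model_a = [-0.08, -0.05, -0.09, -0.02, -0.06]
bias_model_b = [-0.06, -0.04, -0.06, -0.02, -0.03]
t_stat, df = paired_t(bias_model_a, bias_model_b)
print(round(t_stat, 3), df)
```

With 60 groups there would be 59 degrees of freedom, for which the two-tailed critical value at the .01 level is roughly 2.66. Because the test works on within-group differences, two models can differ significantly even when their mean biases are close.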

Table 5.4: Group-Level Measures by Stoploss Level, using untransformed dollars

Stoploss
level      A_G(1)   A_G(2)   CHR(1)   CHR(2)   ACG      ADG(1)   ADG(2)

(a) Mean Forecasting Bias, HMO-A
50k        -8.5%    -8.8%    -7.1%    -7.3%    -5.5%    -5.6%    -5.8%
25k        -6.0%    -6.2%    -4.6%    -4.7%    -2.9%    -3.1%    -3.3%
10k        -4.2%    -4.3%    -2.9%    -2.9%    -1.2%    -1.5%    -1.6%
5k         -4.5%    -4.5%    -3.2%    -3.2%    -1.7%    -1.9%    -1.9%

(b) Mean Squared Forecasting Error, HMO-A
50k        0.9%     1.0%     0.7%     0.7%     0.5%     0.5%     0.6%
25k        0.5%     0.6%     0.4%     0.4%     0.3%     0.3%     0.3%
10k        0.3%     0.3%     0.2%     0.2%     0.2%     0.2%     0.2%
5k         0.3%     0.3%     0.2%     0.2%     0.1%     0.1%     0.1%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
50k        23.3%    23.3%    31.7%    28.3%    35.0%    43.3%    41.7%
25k        41.7%    38.3%    55.0%    56.7%    61.7%    65.0%    63.3%
10k        53.3%    56.7%    66.7%    65.0%    78.3%    80.0%    78.3%
5k         53.3%    51.7%    70.0%    70.0%    81.7%    85.0%    81.7%

(d) Mean Forecasting Bias, HMO-B
50k        -6.4%    -5.9%    -5.6%    -5.0%    -2.1%    -6.8%    -6.5%
25k        -8.5%    -8.1%    -7.8%    -7.3%    -4.6%    -8.3%    -8.1%
10k        -8.4%    -8.0%    -7.7%    -7.4%    -5.3%    -7.8%    -7.7%
5k         -7.3%    -6.9%    -6.7%    -6.3%    -4.6%    -6.7%    -6.6%

(e) Mean Squared Forecasting Error, HMO-B
50k        0.6%     0.6%     0.5%     0.5%     0.3%     0.7%     0.7%
25k        0.9%     0.8%     0.8%     0.7%     0.4%     0.9%     0.8%
10k        0.8%     0.8%     0.7%     0.7%     0.4%     0.7%     0.7%
5k         0.6%     0.6%     0.5%     0.5%     0.3%     0.5%     0.5%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
50k        30.0%    33.3%    31.7%    38.3%    58.3%    26.7%    31.7%
25k        21.7%    20.0%    21.7%    23.3%    45.0%    13.3%    18.3%
10k        16.7%    20.0%    21.7%    25.0%    41.7%    16.7%    15.0%
5k         26.7%    30.0%    28.3%    31.7%    58.3%    30.0%    30.0%

There were also significant differences, in measures of mean forecasting bias, between each pair of
the four basic methods on this measure, except between ACGs and the ADG models. That pattern
was the same across stoploss levels, including the lack of differences between ACGs and ADGs.
That is to say that, even though there were significant differences between the two versions of the
ADG  model,  there  was  no  significant  underlying  difference  between  the  ACG  model  and  either  of 
the  ADG  models.  This  emphasizes  the  point  that  continuous  and  categorical  treatments  of  age  will 
produce  different  specific  expectations  for  any  given  risk  adjustment  method,  but  that  those 
differences  may  not  matter  when  comparing  those  expectations  with  other  methods.  From  a 
different  perspective,  the  absolute  differences  in  mean  bias  between  the  three  models  (there  is  less 
absolute  difference  between  the  two  ADG  models  than  between  the  ACG  model  and  either  of  the 
ADG  models  -  see  section  a  of  Table  5.4)  also  illustrate  the  fact  that  summary  measures  of 
performance  can  provide  misleading  indication  of  the  significance  of  differences  that  underlie  those 
measures  if  they  are  used  as  the  primary  criteria  for  assessing  relative  performance  across  risk 
adjustment  methods.  The  level  of  stoploss  did  not  affect  the  pattern  of  differences  between  risk 
adjustment  methods. 
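
The contrast between a paired test and a comparison of summary means can be illustrated with a minimal sketch. It assumes that prediction error is recorded for each group as a fraction of actual cost; the function name `paired_t` and every number below are hypothetical and are not study data. The point is only that a very small mean difference can produce a large paired t statistic when the difference is consistent across the same groups.

```python
import math
import statistics

def paired_t(errors_a, errors_b):
    """Return (mean difference, t statistic) for two models' errors
    measured on the same groups."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err

# Hypothetical prediction errors for six groups (illustrative only).
model_1 = [-0.050, -0.020, 0.010, -0.040, 0.030, -0.060]
# A second model that differs by a small but very consistent amount.
model_2 = [e - 0.002 - 0.0002 * i for i, e in enumerate(model_1)]

mean_diff, t_stat = paired_t(model_1, model_2)
print(f"mean difference = {mean_diff:.4f}, t = {t_stat:.1f}")
```

The t statistic would then be compared with the critical value for the appropriate degrees of freedom (the number of groups minus one).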

A McNemar test of paired proportions was used to assess the significance of differences in the percentage of groups within 5 percent of actual costs. The test is based on the groups for which two risk adjustment methods disagree, that is, groups that fall within the band of error under one method but not under the other, so its result depends on which groups make up each proportion and not only on the two percentages. Where the paired t-tests of mean forecasting bias showed significant differences between otherwise similar models with continuous versus categorical treatments of age, there were no significant differences in the proportion of groups within the 5 percent band for those pairs of models, including ADGs, at any stoploss level. This seems to reinforce the suggestion that categorical and continuous treatments of age produce different expectations that lead to the same general result for any given risk adjustment method.
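
As a rough illustration of how such a test operates, the sketch below implements the exact (binomial) form of McNemar's test, which may differ from the variant used in this study. The function name `mcnemar_exact` and the 60 hypothetical group outcomes are assumptions made for illustration. The statistic uses only the discordant groups: those inside the 5 percent band under one method but outside it under the other.

```python
from math import comb

def mcnemar_exact(within_a, within_b):
    """Two-sided exact McNemar test for paired yes/no outcomes.

    Returns (b, c, p_value), where b counts groups inside the band under
    method A only and c counts groups inside the band under method B only.
    """
    b = sum(1 for a, other in zip(within_a, within_b) if a and not other)
    c = sum(1 for a, other in zip(within_a, within_b) if other and not a)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the methods never disagree
    k = min(b, c)
    # Binomial(n, 0.5) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Hypothetical outcomes for 60 groups (True = within 5 percent of actual).
within_a = [True] * 40 + [True] * 12 + [False] * 2 + [False] * 6
within_b = [True] * 40 + [False] * 12 + [True] * 2 + [False] * 6

b, c, p_value = mcnemar_exact(within_a, within_b)
print(b, c, round(p_value, 4))
```

In this hypothetical case the two methods agree on 46 groups, and the test result rests entirely on the 14 groups where they disagree.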

There were no significant differences between the demographic and chronic flag models in paired analyses on this measure at the $50,000 stoploss level, although differences were significant at lower levels. The same was true for pairs of the demographic and ACG models. There were no significant differences between the chronic flag and ACG models at any stoploss level. There were significant differences between the ADG and demographic models at all levels. Differences between the ADG and chronic flag models were limited at higher stoploss levels and more often significant at the lower levels. The ADG and ACG models were not significantly different at any level.

The general pattern of change across stoploss levels on this measure seems to imply that differences are more significant at lower levels. This is consistent with the pattern of individual-level R² results: lower stoploss levels seem to exaggerate differences between risk adjustment methods. If the McNemar test results are "true", they would suggest that the choice of risk adjustment method would be more important at increasingly lower stoploss levels.

Bias in HMO-B. The results on group-level measures of predictive accuracy for HMO-B are shown in sections (d), (e), and (f) of Table 5.4. ACGs appear, at first glance, to perform "the best" of any method on each of the three measures, but the overall pattern of these results is clearly biased (they are not consistent with the pattern of results given the individual-level measures) in some way that has not been suggested in any of the previous analyses of HMO-B. ADGs perform no better than any of the other models, and the pattern of results across stoploss levels is erratic when a clearer pattern of improvement should be evident at increasingly lower stoploss levels.

In every reasonable sense, the same procedures were used to draw samples, to generate expectations, and to aggregate results in both study sites. Yet something that is not readily evident in the individual-level results is clearly different across the plans. The obvious source of that difference must be embodied in the relationship between estimation and validation samples in each plan.

In  order  to  explore  those  relationships  further,  the  mean  expected  values  for  the  estimation  samples 
were  divided  by  the  mean  expected  values  for  the  respective  validation  samples  in  each  plan.  Table 
5.5  shows  the  resultant  ratios  for  the  actual-dollar  model  and  the  separate  components  of  the  four- 
part  log  model,  given  stoploss  at  $25,000  and  for  selected  models.  Both  study  sites  are  included  in 
the  table.  There  are  noticeable  differences  in  the  distribution  of  charges  across  estimation  and 
validation  samples,  on  several  dimensions,  between  HMO-A  and  HMO-B.  Mean  expected  values 
are  much  the  same  for  the  underlying  samples  for  HMO-A,  as  evidenced  by  the  clustering  of  ratios 
around  1.00.  There  are  differences  across  risk  adjustment  methods  for  that  plan,  but  they  are  not 
particularly  pronounced. 
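
The ratio described above is simple arithmetic, sketched here under stated assumptions: the function name `expectation_ratio` and the per-member expected costs are hypothetical and serve only to show the calculation. A ratio near 1.00 indicates that the two samples have similar mean expectations.

```python
from statistics import mean

def expectation_ratio(estimation_expected, validation_expected):
    """Mean expected cost in the estimation sample divided by the
    mean expected cost in the validation sample."""
    return mean(estimation_expected) / mean(validation_expected)

# Hypothetical expected costs per member (illustrative values only).
estimation = [1200, 900, 1500]
validation = [1000, 1100, 1100]

ratio = expectation_ratio(estimation, validation)
print(round(ratio, 3))
```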


Table 5.5: Ratios of Mean Expectations (estimation/validation)

Stoploss at $25,000                 A_G(1)  CHR(1)     ACG  ADG(1)

HMO-A
  One-part raw dollar model           1.02    1.00    0.98    0.99
  Four-part log model (ambulatory)    1.01    1.06    0.99    0.92
  Four-part log model (inpatient)     1.02    0.96    0.98    1.02

HMO-B
  One-part raw dollar model           1.13    1.12    1.08    1.13
  Four-part log model (ambulatory)    1.08    1.08    1.06    1.09
  Four-part log model (inpatient)     1.17    1.16    1.10    1.14

Differences in mean expected values were much greater between the estimation and validation samples in HMO-B. The demographic and ACG models exhibit more pronounced differences in the distribution of inpatient and ambulatory costs in HMO-B as compared to that in HMO-A. Where modest differences might be used to distinguish the four basic risk adjustment methods in HMO-A, the ratios for the demographic, chronic flag, and ADG models do not distinguish those methods in HMO-B. The ratios for ACGs are noticeably lower than those for the other methods in HMO-B.

The  estimation  and  validation  samples  are  more  representative  of  each  other  in  HMO-A,  than  in 
HMO-B,  at  least  as  reflected  in  mean  costs.  In  a  tangentially  related  context,  Dunn  et  al.  (1995) 
derived  parameter  estimates  for  risk  factors  from  a  very  large  sample  of  data  drawn  from  a  variety 
of  sites.  The  underlying  data  had  been  adjusted  for  differences  in  costs  across  geographic  areas  and 
across  years.  Expectations  were  calculated  in  each  separate  site  based  on  the  parameter  estimates 
from  the  full  set  of  data,  and  the  results  were  compared  to  test  the  stability  of  risk  weights  across 
pools.  The  researchers  also  made  an  adjustment  tied  to  mean  expenditures  to  account  for  service- 
level  differences  by  age  and  gender  in  each  site  (or  pool). 

To the extent that the bias evident in group-level measures of predictive accuracy for HMO-B can be attributed to mean differences in the samples, some correction might be appropriate. One approach, similar to that used by Dunn et al., would be to adjust for the differences in mean expected costs of the estimation and validation samples. At the same time, there is something counterintuitive about that solution, at least for this study, since the purpose of risk adjustment is to account for health status differences so that it is possible to assess other differences (such as costs). The analysis presented in the next section suggests that a better generalized solution to address population differences that produce bias in group-level results may be found in the form of the regression model used to generate the underlying expectations.

Data  Treatment  Methods 

It is worth noting, again, that the individual-level results for each of the log-based models were poorer than those for the actual-dollar models. If an R², for example, is a dependable measure of the relative performance of two applications of a risk adjustment method, and basic statistical assumptions regarding the relationship between individual and group-level mean values hold, then it is reasonable to assume, based on the individual-level results presented earlier (see Table 5.3), that group-level measures for the log-based models should be no better than, and possibly worse than, those for the actual-dollar models. Table 5.6 presents the three group-level measures of predictive accuracy for both HMO-A and HMO-B, for groups of 3,000, at a stoploss level of $25,000. In this table, rows reflect data treatment model type rather than stoploss level. As noted before, a full set of summary results for all the models is included in Appendix B.

Table 5.6: Group-Level Measures by Data Treatment, stoploss at $25,000, groups of 3,000

Data
treatment   A_G(1)  A_G(2)  CHR(1)  CHR(2)     ACG  ADG(1)  ADG(2)

(a) Mean Forecasting Bias, HMO-A
actual       -6.0%   -6.2%   -4.6%   -4.7%   -2.9%   -3.1%   -3.3%
log         -11.0%  -12.2%   -9.0%   -9.8%    3.7%    2.2%    2.2%
two-part     -8.6%   -9.5%  -15.3%  -15.5%   -4.0%    4.5%    3.9%
four-part    -4.0%   -4.3%   -3.7%   -3.8%   -2.3%   -1.1%   -1.4%

(b) Mean Squared Forecasting Error, HMO-A
actual        0.5%    0.6%    0.4%    0.4%    0.3%    0.3%    0.3%
log           1.4%    1.6%    1.0%    1.1%    0.4%    0.3%    0.3%
two-part      0.9%    1.1%    2.5%    2.5%    0.4%    0.5%    0.4%
four-part     0.3%    0.4%    0.3%    0.3%    0.3%    0.2%    0.2%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
actual       53.3%   56.7%   66.7%   65.0%   78.3%   80.0%   78.3%
log          25.0%   15.0%   45.0%   36.7%   38.3%   48.3%   51.7%
two-part     41.7%   33.3%    3.3%    3.3%   71.7%   16.7%   28.3%
four-part    78.3%   68.3%   83.3%   76.7%   78.3%   78.3%   83.3%

(d) Mean Forecasting Bias, HMO-B
actual       -8.5%   -8.1%   -7.8%   -7.3%   -4.6%   -8.3%   -8.1%
log          -6.7%   -7.4%   -0.9%   -0.8%   12.3%   20.5%   21.1%
two-part     -7.5%   -7.6%  -15.1%  -15.1%   -0.5%    6.4%    6.0%
four-part    -4.4%   -4.1%   -4.4%   -4.1%    1.2%   -0.5%   -0.3%

(e) Mean Squared Forecasting Error, HMO-B
actual        0.9%    0.8%    0.8%    0.7%    0.4%    0.9%    0.8%
log           0.6%    0.7%    0.2%    0.2%    1.8%    4.7%    4.9%
two-part      0.7%    0.7%    2.4%    2.4%    0.2%    0.7%    0.6%
four-part     0.4%    0.4%    0.4%    0.4%    0.2%    0.2%    0.2%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
actual       21.7%   20.0%   21.7%   23.3%   45.0%   13.3%   18.3%
log          31.7%   26.7%   73.3%   75.0%    6.7%    0.0%    0.0%
two-part     26.7%   26.7%    3.3%    3.3%   73.3%   43.3%   48.3%
four-part    43.3%   48.3%   45.0%   46.7%   78.3%   73.3%   71.7%

Analysis of the actual-dollar model in the preceding section indicated that group-level measures for HMO-A reflected the same general pattern as the individual-level results across risk adjustment methods, although mean forecasting bias went up slightly for all models at the lowest stoploss level, rather than down as expected. Table 5.6 shows that, while the results for the one-part and two-part log models were erratic and compare poorly to those for the actual-dollar model, the four-part log models reflect the same overall pattern as the actual-dollar models for HMO-A (at the $25,000 stoploss level), and that the change in data treatment produces consistently better results than those for the actual-dollar models on each measure. There is more relative improvement for the simple demographic and ADG models in HMO-A as a consequence of using the four-part treatment, but the pattern of results across the four basic risk adjustment methods remains the same. Unlike the results for the actual-dollar model, measures at the other stoploss levels (included in Appendix B) show that, using the four-part log treatment, mean forecasting bias goes down between the $10,000 and $5,000 levels for ADGs, as might generally be expected given basic principles of statistics.

The pattern of results on group-level measures for the actual-dollar model in HMO-B reflected significant bias (see Table 5.4). There seemed to be no dependable relationship between those results and the prior individual-level measures. Group-level results for HMO-B using the four-part log treatment are shown in sections (d), (e), and (f) of Table 5.6. Those data show substantial improvement in group-level measures as a consequence of using the four-part log approach. For example, mean forecasting bias is reduced from more than 8 percent to less than 1 percent for the ADG(1) model. Mean squared forecasting error is reduced from 0.9 to 0.2 percent, and the percentage of groups within 5 percent of actual costs goes up from 13 to more than 73 percent for the same model. That improvement is evident in the absolute measures for each model, but also intuitively in that the "usual" pattern of relationships across risk adjustment methods is largely restored. Similar results can be seen at all stoploss levels.

Findings based on both HMO-A and HMO-B suggest that the four-part log treatment improves and, in a sense, stabilizes group-level results in a way that is more consistent with prior individual-level measures of model performance. Bias related to assumptions regarding the underlying distribution of costs, and not otherwise attributable to the effect of a given risk adjustment method, was removed using the more complex data treatment. That improvement is more evident in the measures for HMO-B because there is more underlying difference between estimation and validation samples in HMO-B. It is most evident in the results for ADGs because they have the least otherwise-biased relationship to the underlying distribution of costs. This illustrates that the extent to which a given data treatment (using actual dollars versus the four-part log treatment, for example) controls for bias in the underlying distribution of health service costs will be more evident as the need for that adjustment is greater. It also suggests that the extent of selection differences between populations, in this case estimation and validation samples, will be evident when alternative assumptions regarding the underlying distribution of costs are applied.

The group-level measures, such as those shown in Table 5.6, are summary measures based on the predictive ratios of 60 groups for any given stoploss level, data treatment, and group size. More succinctly, the group-level measures are characterizations of the error associated with applying risk adjusted expectations. Mean forecasting bias, for example, is the mean of the prediction error evident across any given set of 60 groups. To further illustrate the contribution to predictive accuracy of using the four-part log treatment versus actual dollars, Figures 5.1 through 5.4 show the distribution of prediction error for the 60 groups given stoploss at $25,000 and groups of 3,000 plan members. These graphs depict the group-specific results that underlie the summary measures in Table 5.6.
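
A minimal sketch of how the three summary measures might be computed from group-level totals follows. It assumes that prediction error for a group is (predicted - actual) / actual, so that negative values indicate under-prediction, and that mean squared forecasting error is the mean of the squared errors; those definitions, the function name `group_level_measures`, and the four sample groups are assumptions for illustration rather than the study's documented formulas or data.

```python
def group_level_measures(predicted, actual, band=0.05):
    """Summarize prediction error across groups.

    Returns (mean forecasting bias, mean squared forecasting error,
    share of groups whose error is within +/- band of actual cost).
    """
    errors = [(p - a) / a for p, a in zip(predicted, actual)]
    n = len(errors)
    bias = sum(errors) / n
    msfe = sum(e * e for e in errors) / n
    within = sum(1 for e in errors if abs(e) <= band) / n
    return bias, msfe, within

# Hypothetical predicted and actual group costs (illustrative only).
predicted = [96.0, 102.0, 90.0, 99.0]
actual = [100.0, 100.0, 100.0, 100.0]

bias, msfe, within = group_level_measures(predicted, actual)
print(f"bias={bias:.2%}  msfe={msfe:.2%}  within 5%={within:.1%}")
```

In the study each summary would be taken over the 60 groups for a given stoploss level, data treatment, and group size.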

Figures 5.1 and 5.2 show the distribution of prediction error for each of the seven risk adjustment models applied in HMO-A using actual dollars and the four-part log data treatment, respectively. Figures 5.3 and 5.4 show the same distributions for HMO-B. There is an evident shift in prediction error toward a mean error of 0 (zero) using the four-part data treatment in both plans. As the summary measures in Table 5.6 indicate, for example, mean forecasting bias for the ADG(1) model in HMO-A shifts from -3.1 to -1.1 percent. Mean forecasting bias for the ADG(1) model in HMO-B shifts from -8.3 to -0.5 percent. Where the group-level results based on the use of actual dollars did not appear to be comparable across the two study plans (as illustrated by Figures 5.1 and 5.3), results based on the use of the four-part log treatment are strikingly similar for the two sites (compare Figures 5.2 and 5.4). It should be noted that the relative performance of the risk adjustment methods was also affected by the choice of data treatment. The shift in mean forecasting bias was less pronounced for the simple demographic models than for the ADG models in HMO-B, for example.

Table 5.7 is a full set of results for the four-part log model comparable to those shown earlier for the actual-dollar model in Table 5.4. This table reflects groups of 3,000, as does the earlier table. There is clearly an overall improvement for each of the measures in the table at every stoploss level for both HMOs. Mean forecasting bias goes down with each stoploss level for the ADG model in HMO-B, where it had appeared erratic across those levels in the earlier table reflecting the actual-dollar models. The bias still goes up slightly for the demographic-based models at lower levels in both plans, but not enough to otherwise diminish the sense of improvement. These results are meaningful in that they demonstrate the reduction in bias that can be attributed to the generation of the underlying expectations as opposed to bias attributable to specific risk adjustment methods.

The statistical tests described earlier in the discussion of Table 5.4 were repeated using the results of the four-part model for HMO-A. Paired t-tests of mean forecasting bias show that, unlike the results for the actual-dollar model, the ACG model is generally significantly different from the ADG models, except for the lowest stoploss level. The significance of differences between the demographic and chronic flag models seems to disappear on this measure with increasing stoploss levels. The A_G(1) and A_G(2) models are significantly different at each stoploss level.

Figure 5.3

Table 5.7: Group-Level Measures by Stoploss Level, using a 4-part log treatment, groups of 3,000

Stoploss
level       A_G(1)  A_G(2)  CHR(1)  CHR(2)     ACG  ADG(1)  ADG(2)

(a) Mean Forecasting Bias, HMO-A
50k          -6.7%   -6.9%   -6.5%   -6.5%   -5.3%   -3.8%   -4.2%
25k          -4.0%   -4.3%   -3.7%   -3.8%   -2.3%   -1.1%   -1.4%
10k          -1.9%   -2.4%   -1.6%   -2.0%   -0.1%    0.6%    0.3%
5k           -2.4%   -2.9%   -2.0%   -2.5%   -0.5%    0.1%   -0.1%

(b) Mean Squared Forecasting Error, HMO-A
50k           0.7%    0.7%    0.6%    0.6%    0.5%    0.4%    0.4%
25k           0.3%    0.4%    0.3%    0.3%    0.3%    0.2%    0.2%
10k           0.2%    0.2%    0.2%    0.2%    0.2%    0.1%    0.1%
5k            0.2%    0.2%    0.1%    0.2%    0.1%    0.1%    0.1%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
50k          35.0%   35.0%   36.7%   36.7%   35.0%   50.0%   48.3%
25k          55.0%   51.7%   56.7%   56.7%   65.0%   80.0%   76.7%
10k          78.3%   68.3%   83.3%   76.7%   78.3%   78.3%   83.3%
5k           78.3%   73.3%   81.7%   78.3%   81.7%   83.3%   85.0%

(d) Mean Forecasting Bias, HMO-B
50k          -2.4%   -1.9%   -2.6%   -2.2%    3.5%    0.7%    1.1%
25k          -4.4%   -4.1%   -4.4%   -4.1%    1.2%   -0.5%   -0.3%
10k          -4.4%   -4.4%   -4.1%   -4.5%    0.5%   -0.3%   -0.3%
5k           -3.8%   -3.9%   -3.3%   -3.7%    0.6%   -0.2%   -0.3%

(e) Mean Squared Forecasting Error, HMO-B
50k           0.3%    0.3%    0.3%    0.3%    0.4%    0.3%    0.3%
25k           0.4%    0.4%    0.4%    0.4%    0.2%    0.2%    0.2%
10k           0.4%    0.3%    0.3%    0.3%    0.1%    0.1%    0.1%
5k            0.3%    0.3%    0.2%    0.2%    0.1%    0.1%    0.1%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
50k          65.0%   63.3%   61.7%   65.0%   56.7%   68.3%   68.3%
25k          43.3%   48.3%   45.0%   46.7%   78.3%   73.3%   71.7%
10k          51.7%   50.0%   53.3%   51.7%   81.7%   80.0%   80.0%
5k           60.0%   60.0%   71.7%   58.3%   93.3%   86.7%   83.3%

McNemar  tests  of  the  percentage  of  groups  within  5  percent  of  actual  costs  also  reflect  change  in 
results  between  the  actual-dollar  and  four-part  log  models.  There  were  no  significant  differences 
between  the  demographic  and  chronic  flag  models  at  any  stoploss  level  (except  one  seemingly 
anomalous test at the $10,000 level). There were significant differences between the ADG and
chronic  flag  models  at  the  two  highest  stoploss  levels.  There  were  also  significant  differences 
between  the  ADG  and  ACG  models  at  those  higher  levels.  Differences  remained  not  significant  for 
each  continuous  and  categorical  model  pair,  and  for  pairs  of  the  ACG  and  chronic  flag  models,  at 
all  stoploss  levels.  Aside  from  the  one  exception  already  noted,  there  were  no  statistical  differences 
between  any  of  the  models  at  the  two  lowest  stoploss  levels  on  this  measure.  Again,  tables 
reflecting  these  patterns  are  included  in  Appendix  B. 

Where  the  significance  of  differences  seemed  to  be  greater  on  this  measure  at  lower  stoploss  levels 
for  the  actual-dollar  model,  results  using  the  four-part  log  treatment  show  greater  significance  at 
higher  stoploss  levels.  In  other  words,  increasingly  lower  stoploss  thresholds  reduce  the 
significance  of  differences,  between  risk  adjustment  models,  in  the  number  of  groups  that  fall 
within  5  percent  of  actual  costs  in  HMO-A.  This  seems  to  be  related  to  relatively  greater 
improvement  in  the  results  for  the  demographic  and  chronic  flag  models  using  the  four-part 
treatment.  There  are  only  minor  differences  (with  respect  to  the  pattern  of  statistical  significance 
across  risk  adjustment  methods)  in  the  results  for  the  $50,000  and  $25,000  stoploss  levels, 
although  differences  are  apparent  in  absolute  terms  across  those  levels. 

Given  these  patterns  of  statistical  significance,  the  four-part  log  treatment  of  the  dependent  cost 
variable  used  in  these  models  seems  to  clarify  the  extent  of  differences  across  models,  relative  to 
using  actual  dollars.  On  one  hand,  the  four-part  results  seem  to  suggest  finer  distinctions  across 
models  in  underlying  differences  of  predicted  and  actual  costs  (mean  forecasting  bias).  There  were 
also  more  discernable  differences  between  ADGs  and  both  the  ACG  and  chronic  flag  models  in  the 
percentage  of  groups  within  5  percent  of  actual  at  higher  stoploss  levels.  On  the  other  hand,  there 
was  less  statistical  significance  in  the  difference  between  risk  adjustment  methods,  as  measured  by 
the  number  of  groups  affected,  at  lower  stoploss  levels  using  the  four-part  treatment. 

There is overall improvement in absolute terms for each of the models using the four-part log treatment in HMO-B. However, the pattern of absolute results in the percentage of groups within 5 percent of actual does not conform to the expected pattern across stoploss levels for the demographic-based models in HMO-B (percentages go up, rather than down, at the $50,000 stoploss level). Moreover, the McNemar tests show relatively more significant differences at lower stoploss levels. The differences that emerge are the result of differences in the response for the ACG and ADG models, as opposed to the demographic and chronic flag models, to the change in data treatment. The ACG and ADG models now conform to the expected pattern of results across stoploss levels, whereas the demographic-based models do not. This is indicative of some level of bias in demographic-based models that was not addressed using the four-part approach, as it was for ADGs and ACGs. This also illustrates that the nature of bias associated with the demographic-based models is fundamentally different from the bias inherent in ACGs and ADGs. Further, this is consistent with the discussion of the shift in relative performance across risk adjustment models that is evident for HMO-B in Figures 5.3 and 5.4. In a broader context, this can be seen as formal evidence of the inherent superiority of risk adjustment methods based on more detailed clinical data, as opposed to primarily demographic methods, to account for selection bias in the distribution of health care costs.

As noted earlier, using a four-part log treatment with the data in this study seems to clean up (suggest more predictable patterns given basic statistical principles) and improve (in absolute terms) group-level measures of predictive accuracy. Those group-level measures also seem to be more consistently in line with prior expectations given individual-level measures. Tables showing the results of both the paired t and McNemar tests for HMO-A, for the full set of four-part log results, are included in Appendix B. Comparable results based on the actual-dollar model, given groups of 3,000, that support the discussion of Table 5.4 are also included in that Appendix.

Group  Size 

According  to  basic  statistical  principles  underlying  the  estimation  of  expectations,  the  variance  in 
the  distribution  of  sample  means,  such  as  the  means  used  to  estimate  risk  adjusted  expectations  of 
costs,  is  related  to  the  variance  of  the  underlying  individual  measures  by  some  function  of  the  size 
of  the  sample  drawn  from  the  underlying  population  of  measures.  If  the  mean  is  an  unbiased 
estimate  of  costs,  then  the  mean  of  sample  means  should  have  a  predictable  relationship  to  the  size 
of  the  samples.  In  other  words,  measures  such  as  mean  forecasting  bias  and  mean  squared 
forecasting  error  should  get  better  as  the  relative  size  of  samples  based  on  underlying  expectations 
goes  up.  Table  5.8  shows  the  three  measures  of  predictive  accuracy  by  group  size,  based  on  the 
four-part  log  model  and  a  stoploss  level  of  $25,000.  Data  for  both  HMO-A  and  HMO-B  are 
included  in  the  table. 
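
The principle invoked here can be written out as a short sketch. It assumes that a group consists of n members whose costs are independent with a common mean and variance, which is an idealization and not a property established for the study data; the symbol b, standing for any fixed error in the expectation, is introduced only for illustration.

```latex
% Assumptions: n independent members with common mean \mu and variance \sigma^2;
% the expectation misses \mu by a fixed amount b that does not depend on n.
\operatorname{Var}\!\left(\bar{X}_n\right) = \frac{\sigma^2}{n},
\qquad
\mathbb{E}\!\left[\left(\bar{X}_n - \mu + b\right)^2\right] = b^2 + \frac{\sigma^2}{n}
```

Under these assumptions the variance component of squared forecasting error shrinks in proportion to 1/n as groups get larger, while the bias component b does not depend on group size.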

As a rule, the mean forecasting bias and mean squared forecasting error go down (improve) across risk adjustment methods, from the demographic, to the chronic flag, to the ACG, to the ADG models. The percentage of groups within 5 percent of actual costs goes up across methods. These are the "usual" patterns of relationships between methods for these measures. The usual pattern for these measures between stoploss levels is that, as the level goes down, mean forecasting bias and mean squared forecasting error go down, and the percent of groups in the 5 percent band goes up. These patterns are meaningful in that they provide evidence that differences that appear in results across risk adjustment methods and across stoploss levels are not the result of inappropriate assumptions regarding the distribution of costs, or some other source of bias. Bias that is independent of any specific risk adjustment method will confound the comparison of results across all risk adjustment methods.

Table 5.8: Group-Level Measures by Group Size, using a 4-part log treatment, stoploss at $25,000

Group
size        A_G(1)  A_G(2)  CHR(1)  CHR(2)     ACG  ADG(1)  ADG(2)

(a) Mean Forecasting Bias, HMO-A
5000         -3.2%   -3.4%   -2.8%   -2.8%   -1.2%   -0.1%   -0.3%
3000         -4.0%   -4.3%   -3.7%   -3.8%   -2.3%   -1.1%   -1.4%
1500         -3.2%   -3.3%   -2.8%   -2.7%   -0.8%    0.5%    0.4%
500          -3.7%   -3.5%   -3.4%   -3.0%   -1.0%    0.0%    0.2%

(b) Mean Squared Forecasting Error, HMO-A
5000          0.2%    0.2%    0.2%    0.2%    0.1%    0.1%    0.1%
3000          0.3%    0.4%    0.3%    0.3%    0.3%    0.2%    0.2%
1500          0.6%    0.6%    0.6%    0.6%    0.7%    0.5%    0.5%
500           1.6%    1.6%    1.5%    1.5%    1.5%    1.3%    1.3%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
5000         68.3%   66.7%   68.3%   66.7%   75.0%   85.0%   85.0%
3000         55.0%   51.7%   56.7%   56.7%   65.0%   80.0%   76.7%
1500         46.7%   48.3%   48.3%   45.0%   51.7%   58.3%   56.7%
500          18.3%   21.7%   20.0%   28.3%   38.3%   38.3%   33.3%

(d) Mean Forecasting Bias, HMO-B
5000         -4.0%   -3.7%   -4.0%   -3.8%    1.9%   -0.3%   -0.0%
3000         -4.4%   -4.1%   -4.4%   -4.1%    1.2%   -0.5%   -0.3%
1500         -4.1%   -3.8%   -4.4%   -4.2%    1.2%   -1.0%   -0.8%
500          -4.2%   -4.3%   -3.7%   -3.9%    0.8%   -2.0%   -1.9%

(e) Mean Squared Forecasting Error, HMO-B
5000          0.2%    0.2%    0.2%    0.2%    0.1%    0.1%    0.1%
3000          0.4%    0.4%    0.4%    0.4%    0.2%    0.2%    0.2%
1500          0.7%    0.6%    0.6%    0.6%    0.6%    0.5%    0.5%
500           1.4%    1.3%    1.3%    1.3%    1.7%    1.5%    1.6%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
5000         65.0%   68.3%   65.0%   70.0%   83.3%   96.7%   95.0%
3000         43.3%   48.3%   45.0%   46.7%   78.3%   73.3%   71.7%
1500         41.7%   45.0%   50.0%   53.3%   41.7%   45.0%   45.0%
500          33.3%   33.3%   36.7%   33.3%   28.3%   33.3%   35.0%

Where  the  pattern  of  results  for  mean  forecasting  bias  seems  to  be  consistent  across  risk  adjustment 
methods  and  across  stoploss  levels,  sections  (a)  and  (d)  of  Table  5.8  show  that  this  measure  does 
not  necessarily  vary  in  a  dependable  pattern  by  group  size.  Where  it  does  vary,  the  change  does  not 
seem  to  be  related  to  changes  in  the  other  measures.  Another  way  to  say  this  is  that  mean 
forecasting  bias  is  not  a  dependable  measure  of  differences  attributable  to  group  size  alone.  It  may, 
however,  be  indicative  of  a  more  fundamental  bias  in  the  expectations.  Both  ADG  models  are 
synchronized  across  all  three  measures  of  predictive  accuracy  in  the  results  for  HMO-B  presented 
in  Table  5.8  (sections  d,  e,  and  f).  In  a  very  expansive  way,  this  could  be  described  as  the  most 
finely-tuned  result  in  this  study.  That  is,  despite  any  bias  in  the  underlying  populations  (the 
estimation  and  validation  samples),  ADGs  produced  the  singularly  best  results  based  on  the 
statistical  assumptions  considered  in  this  analysis.  Those  results  even  conform  to  properties 
regarding  the  relationship  between  sample  size  and  expectations  based  on  sample  means.  At  a 
stoploss  level  of  $25,000,  mean  forecasting  bias  improves  as  group  size  goes  up  with  ADGs  in 
HMO-B,  as  it  does  when  the  stoploss  level  goes  down  given  groups  of  3,000.  None  of  the  results 
for  other  methods  "behave"  quite  so  well.  A  better  risk  adjustment  method,  holding  all  other 
factors  constant,  would  exhibit  the  same  pattern  across  group  sizes  but  show  less  mean  bias  in 
smaller  group  sizes. 
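
The three group-level measures discussed here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's own computation: it assumes each group's error is (expected - actual) / actual, that mean forecasting bias is the mean of those errors across groups, that mean squared forecasting error is the mean of their squares, and that the band measure is the share of groups with an absolute error of 5 percent or less. The function names, the simulated lognormal costs, the 60 groups per size, and the fixed -4 percent offset are all hypothetical.

```python
import math
import random
from statistics import mean


def group_measures(actual_totals, expected_totals, band=0.05):
    """Summarize group-level accuracy under the assumed definitions above."""
    # Relative error of each group's expected total against its actual total.
    errors = [(e - a) / a for a, e in zip(actual_totals, expected_totals)]
    return {
        "mean_forecasting_bias": mean(errors),
        "mean_squared_forecasting_error": mean(x * x for x in errors),
        "share_within_band": mean(1.0 if abs(x) <= band else 0.0 for x in errors),
    }


def simulate_groups(group_size, n_groups=60, offset=-0.04, seed=0):
    """Hypothetical pseudo-groups: lognormal member costs, expectations shifted by `offset`."""
    rng = random.Random(seed)
    mu, sigma = 6.0, 1.0
    # The mean of a lognormal(mu, sigma) cost is exp(mu + sigma^2 / 2);
    # the expectation is deliberately shifted to mimic a systematic bias.
    expected_per_member = math.exp(mu + sigma ** 2 / 2) * (1.0 + offset)
    actual, expected = [], []
    for _ in range(n_groups):
        actual.append(sum(rng.lognormvariate(mu, sigma) for _ in range(group_size)))
        expected.append(group_size * expected_per_member)
    return group_measures(actual, expected)


if __name__ == "__main__":
    for size in (500, 1500, 3000, 5000):
        print(size, simulate_groups(size))
```

In this simulated setting the systematic offset stays roughly constant across group sizes while the squared error shrinks as groups get larger. That is the kind of contrast the text draws between the demographic-based models and the ADG models.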

The  implications  of  these  results  are  important  to  understand,  although  they  simply  illustrate  a 
common  understanding.  In  the  absence  of  any  confounding  bias,  mean  forecasting  bias  should  get 
smaller  as  the  sample  size  gets  larger — because  the  group  size  gets  larger.  Mean  forecasting  bias 
changes  in  an  appropriately  predictable  pattern  for  HMO-B,  getting  smaller  with  increasing  group 
size,  because  there  is  no  other  substantial  confounding  bias.  The  demographic-based  models 
exhibit  roughly  the  same  level  of  forecasting  bias  regardless  of  group  size  because  of  bias  that  is 
inherent  in  those  models.  The  level  of  that  bias  is  apparently  unaffected  by  group  size.  If  ADGs 
exhibited  a  different  pattern  of  results  on  mean  forecasting  bias  at  another  stoploss  level  (which,  in 
fact,  they  do  at  a  stoploss  level  of  $50,000),  then  some  aspect  of  the  relative  difference  between 
alternative  risk  adjustment  methods  would  have  to  be  attributed  to  the  choice  of  stoploss  level  and 
not  to  differences  inherent  in  the  risk  adjustment  methods. 

The  same  implication  holds  in  comparing  relative  results  across  health  plans.  Some  aspect  of  the 
relative  relationship  between  simple  demographic  and  ADG  models  in  HMO-A  is  due  to 
confounding  bias  that  is  evident  in  the  pattern  of  results  on  mean  forecasting  bias  for  the  ADG 
model.  In  other  words,  the  "true"  pattern  of  relationships  between  alternative  risk  adjustment 
methods is more clearly defined in the results for HMO-B than in those for HMO-A. Moreover,
that  unidentified  bias  in  HMO-A  is  not  likely  to  be  due  to  underlying  distributional  effects  since 
those  were  minimized  across  the  two  health  plans  using  the  four-part  log  treatment.  It  may,  for 
example,  be  due  to  some  aspect  of  differences  in  costs  between  the  estimation  and  validation 
samples  in  HMO-A. 

If results based on mean forecasting bias are "true", as they appear to be for HMO-B, then results based on other measures should also be true in the same sense. Thus, it is appropriate, in some sense, to say that the results on the percentage of groups within 5 percent of actual costs for HMO-B, shown in section (f) of Table 5.8, are a more appropriate representation of the relative difference between ADGs and simple demographic methods on that measure than the comparable results for HMO-A. It is important to remember, however, that the results for the demographic-based models are still subject to bias that is inherent in those methods (and that is evident in mean forecasting bias) but that is not readily evident in the pattern of this measure across group sizes. It is not clear what the sum or direction of the effect of the bias evident in section (d) of the table is on the percentage of groups within 5 percent of actual costs. The patterns in section (f) of the table imply that if the groups are small enough, or large enough (larger than 5,000 members), the choice of risk adjustment method will not matter. That holds, of course, only in terms of this measure. If a 3 percent band were used instead of the 5 percent band included here, the percentage of groups within that narrower band might actually get smaller with larger group sizes, because the variance in the prediction error would get smaller around the mean forecasting bias of approximately 4 percent, which is outside of a 3 percent band. In any case, the results on mean squared forecasting error shown in sections (b) and (e) of Table 5.8 show that the distribution of whatever mean level of error is evident will improve (be more evenly distributed) as the size of the groups gets larger.
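
The point about a narrower band can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that group-level prediction error is normally distributed around a fixed bias of -4 percent. The study does not state this distributional assumption, and the standard deviations used are hypothetical stand-ins for the shrinking variance of larger groups.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_within_band(bias, sd, band):
    """P(|error| <= band) when error ~ Normal(bias, sd**2)."""
    return normal_cdf((band - bias) / sd) - normal_cdf((-band - bias) / sd)


BIAS = -0.04  # approximate bias cited in the text for the demographic models

# Smaller standard deviations stand in for larger groups (hypothetical values).
for sd in (0.04, 0.02, 0.01, 0.005):
    within_3 = share_within_band(BIAS, sd, 0.03)
    within_5 = share_within_band(BIAS, sd, 0.05)
    print(f"sd={sd:.3f}  within 3%: {within_3:.3f}  within 5%: {within_5:.3f}")
```

Under these assumptions the share of groups inside the 5 percent band rises as the spread narrows, because the bias lies inside that band. The share inside the 3 percent band falls, because the errors concentrate around a value that lies outside it.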

It  may  be  easier  to  get  the  groups  to  perform  well  as  a  whole  in  terms  of  a  band  of  error  around 
actual  costs  than  to  address  bias  underlying  that  measure.  If  the  principal  concern,  in  the  allocation 
of  reimbursement  for  the  provision  of  health  care  services  is  "the  bottom  line",  as  measured  in 
gross  (or  summary)  terms  such  as  the  number  of  groups  within  such  a  band,  then  good  performance 
results  can  be  achieved  using  any  method  of  risk  adjustment  given  groups  of  large  enough  size  and 
performance  criteria  that  encompass  a  sufficiently  large  margin  of  error.  If  the  principal  concern  is 
bias  in  that  allocation  that  affects  specific  units  of  providers,  and  interest  in  risk  adjustment 
methods  to  address  selection  bias  suggests  that  it  is,  then  the  choice  of  a  risk  adjustment  method 
may  make  a  difference  if  one  method  can  be  shown  to  control  for  bias  better  than  another.  The 
results of this study illustrate that there is less bias associated with ADGs than with demographic-based methods, and that some aspect of comparative results that include demographic-based methods is a function of bias inherent in those methods.

As a reminder of the earlier findings regarding the choice of data treatment, these results are subject to prior assumptions regarding the underlying distribution of costs. Table 5.9 shows the percentages of groups within 5 percent of actual costs by group size before and after using a four-part log treatment in both HMO-A and HMO-B. The contribution on this measure of using the
four-part  treatment  is  evident,  in  absolute  terms,  but  not  pronounced  in  the  results  for  HMO-A 
(compare  sections  a  and  b).  Improvement  in  the  number  of  groups  within  the  5  percent  band  is 
strikingly  evident  in  the  patterns  of  results  across  group  size  in  HMO-B  (compare  sections  c  and  d). 
When  the  underlying  expectations  were  calculated  using  actual  dollars  in  that  plan,  the  percentage 
of  groups  within  the  band  went  down,  rather  than  up,  with  the  size  of  the  groups  for  all  models 
except the model based on ACGs. One implication is that, in the case of both the demographic-based and ADG models, the size of the groups may not be relevant at all if assumptions regarding
the  distribution  of  costs  are  not  properly  addressed.  That  is,  simply  increasing  the  size  of  the 
population  subject  to  risk  adjusted  payment  will  not  necessarily  make  up  for  the  underlying  bias. 
To  the  extent  that  selection  may  be  a  concern,  the  absence  of  assurance  that  assumptions  regarding 
the  underlying  distribution  of  costs  are  properly  addressed  would  be  an  incentive  to  practice  risk 
selection,  again,  regardless  of  the  risk  adjustment  method. 

Figures 5.5 through 5.8 depict the distribution of prediction error by group size for the ADG(1)
model  results  included  in  Table  5.9.  These  graphs  illustrate  the  effects  of  applying  the  four-part 
log  treatment  versus  actual-dollar  values  in  much  the  same  way  as  the  previous  set  of  graphs.  As 
was  the  case  in  the  earlier  graphs,  the  improvement  due  to  the  use  of  the  four-part  log  treatment  is 
less  evident  in  the  results  for  HMO-A,  Figures  5.5  and  5.6,  than  for  HMO-B,  Figures  5.7  and  5.8. 
Again,  as  was  evident  in  the  previous  set  of  graphs,  the  four-part  log  treatment  produced 
comparable  results  across  the  two  study  plans  (compare  Figures  5.6  and  5.8),  while  the  results 
derived  using  actual  dollars  were  not  evidently  comparable  (Figures  5.5  and  5.7).  The  lack  of 
comparability,  across  the  two  plans,  of  results  derived  using  actual  dollars  can  be  attributed  to  bias 
that  was  introduced  by  a  failure  of  assumptions  regarding  the  underlying  distribution  of  the 
dependent  costs,  and  not  to  differences  between  risk  adjustment  methods. 
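
The study's own specification of the four-part log treatment is given elsewhere in the text and is not reproduced here. As a rough illustration only, one common form of a four-part model combines the probability of any use, the probability of inpatient use among users, and separate log-scale cost models for ambulatory-only and inpatient users, each retransformed to dollars with a smearing factor. The sketch below assumes that form. The function names, parameters, and the use of a smearing retransformation are assumptions, not details confirmed by the source.

```python
import math


def duan_smearing_factor(log_residuals):
    """Mean of exp(residual): a retransformation factor for a log-scale model."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def four_part_expected_cost(p_any_use, p_inpatient_given_use,
                            mean_log_ambulatory, smear_ambulatory,
                            mean_log_inpatient, smear_inpatient):
    """Expected dollars for one person under the assumed four-part structure."""
    # Parts 3 and 4: retransform each log-scale prediction back to dollars.
    ambulatory_cost = math.exp(mean_log_ambulatory) * smear_ambulatory
    inpatient_cost = math.exp(mean_log_inpatient) * smear_inpatient
    # Part 2: mix the two user types by the probability of inpatient use.
    cost_given_use = ((1.0 - p_inpatient_given_use) * ambulatory_cost
                      + p_inpatient_given_use * inpatient_cost)
    # Part 1: scale by the probability of any use at all.
    return p_any_use * cost_given_use


if __name__ == "__main__":
    # Hypothetical inputs: 80% use services, 10% of users have an inpatient stay.
    print(four_part_expected_cost(0.8, 0.1, math.log(500), 1.0, math.log(10000), 1.0))
```

A one-part model on untransformed dollars, by contrast, fits a single equation to a highly skewed cost distribution. That is the setting in which the text reports results that are not comparable across the two plans.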

Table 5.9: Percent of Groups within 5% of Actual by Group Size, untransformed dollars versus 4-part log treatment, stoploss at $25,000

Group Size   A/G(1)   A/G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

(a) One-part model using actual dollars, HMO-A
5000          48.3%    45.0%    60.0%    58.3%    75.0%    76.7%    78.3%
3000          41.7%    38.3%    55.0%    56.7%    61.7%    65.0%    63.3%
1500          36.7%    35.0%    41.7%    50.0%    50.0%    50.0%    50.0%
500           18.3%    20.0%    18.3%    23.3%    35.0%    40.0%    41.7%

(b) Four-part model using log transformation, HMO-A
5000          68.3%    66.7%    68.3%    66.7%    75.0%    85.0%    85.0%
3000          55.0%    51.7%    56.7%    56.7%    65.0%    80.0%    76.7%
1500          46.7%    48.3%    48.3%    45.0%    51.7%    58.3%    56.7%
500           18.3%    21.7%    20.0%    28.3%    38.3%    38.3%    33.3%

(c) One-part model using actual dollars, HMO-B
5000          13.3%    16.7%    16.7%    18.3%    63.3%    11.7%    13.3%
3000          21.7%    20.0%    21.7%    23.3%    45.0%    13.3%    18.3%
1500          28.3%    31.7%    30.0%    33.3%    41.7%    30.0%    31.7%
500           36.7%    36.7%    33.3%    36.7%    28.3%    30.0%    31.7%

(d) Four-part model using log transformation, HMO-B
5000          65.0%    68.3%    65.0%    70.0%    83.3%    96.7%    95.0%
3000          43.3%    48.3%    45.0%    46.7%    78.3%    73.3%    71.7%
1500          41.7%    45.0%    50.0%    53.3%    41.7%    45.0%    45.0%
500           33.3%    33.3%    36.7%    33.3%    28.3%    33.3%    35.0%

Figure 5.5

Figure 5.6: Prediction Error by Group Size, ADG(1) (HMO-A), 4-Part Log Model, Stoploss at $25,000

Figure 5.7: Prediction Error by Group Size, ADG(1) (HMO-B), Untransformed Dollar Model, Stoploss at $25,000

Increasingly lower stoploss levels can be used to achieve an effect similar to that of increasingly larger group sizes. The McNemar tests of the percentages of groups within 5 percent of actual costs, shown in Appendix B, generally show that there are no statistically significant differences between risk adjustment methods in groups of any size in this study at stoploss levels of $10,000 or less (based on the four-part treatment), because the numbers of groups within the band are equally high. There are no significant differences between the demographic and chronic flag models at any stoploss level. There are significant differences between ADGs and the other methods at the higher stoploss levels. There are significant differences between the chronic flag and ACG models that seem to be related to group size. They are more significant as group size goes up (ACGs get relatively better). There are similar differences between the ADG and ACG models that seem to be related to stoploss level (ADGs get relatively better as the level goes up). As was the case with earlier evidence, it is difficult to determine whether these differences suggest something inherent about the risk adjustment methods or whether they suggest some other source of bias. Nevertheless, just as differences between risk adjustment methods can be obscured in gross measures of performance using groups of large size, lower stoploss levels will also obscure such differences.
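
A McNemar comparison of two risk adjustment methods on the band measure uses only the groups on which the methods disagree. The sketch below is a minimal exact (binomial) version of the test. Whether the study used the exact or the chi-square form is not stated in this section, so the choice here is an assumption, and the function names and sample data are hypothetical.

```python
from math import comb


def discordant_counts(in_band_a, in_band_b):
    """Count groups inside the band under one method but not the other."""
    b = sum(1 for x, y in zip(in_band_a, in_band_b) if x and not y)
    c = sum(1 for x, y in zip(in_band_a, in_band_b) if y and not x)
    return b, c


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # the methods never disagree, so there is no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is equally likely to fall either way.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical indicators: is each of six groups within the band under each method?
    method_a = [True, True, False, False, True, False]
    method_b = [True, False, True, True, True, True]
    b, c = discordant_counts(method_a, method_b)
    print(b, c, mcnemar_exact_p(b, c))
```

When nearly every group falls inside the band under both methods, as at the lower stoploss levels, there are few discordant groups and the test has little to detect. That is consistent with the absence of significant differences reported above.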

It  is  important  to  remember  that  what  may  be  obscured  at  a  gross  level  of  measurement  by  the 
reduction  in  prediction  error  due  to  the  size  of  the  groups  or  lower  stoploss  levels  is  any  form  of 
bias,  including  bias  inherent  in  each  risk  adjustment  method.  The  analysis  presented  here  is 
intended,  in  part,  to  illustrate  the  potential  for  a  trade-off  between  summary  measures  of 
performance  and  addressing  the  bias  underlying  those  measures.  A  parallel  distinction  can  be  made 
between  the  summary  assessment  of  alternative  risk  adjustment  methods  as  measured  at  the 
individual  level  and  more  subtle  differences  that  are  evident  using  group-level  measures.  In  the 
course  of  this  analysis,  two  otherwise  comparable  health  plans  exhibited  strikingly  similar  results 
across  risk  adjustment  methods  using  individual-level  measures  alone.  Bias  associated  with 
assumptions  regarding  the  underlying  distribution  of  health  service  costs  (and  unrelated  to  any 
specific  risk  adjustment  method)  was  largely  obscured  in  those  measures.  More  pronounced 
selection  differences  in  estimation  and  validation  samples  in  HMO-B,  as  compared  to  HMO-A, 
made  the  importance  of  those  assumptions  more  evident.  Addressing  the  bias  related  to  the 
underlying  distribution  of  costs  made  the  relative  performance  of  alternative  risk  adjustment 
methods  more  comparable  across  the  two  health  plans,  and  the  assessment  of  their  relative 
performance  more  certain.  Removing  one  systematic  component  of  bias,  related  to  the  underlying 
distribution  of  costs,  also  made  the  sum  of  the  remaining  bias  evident — in  group-level  measures. 

The  results  of  this  study  illustrate  a  cautionary  point.  Improvements  related  to  using  a  four-part  log 
treatment  were  evident,  though  much  less  obvious,  in  measures  drawn  from  HMO-A  because  there 
was  less  bias  between  estimation  and  validation  samples  in  that  plan.  Those  samples  were  much 
like  samples  drawn  using  strict  random  sampling  from  a  specific  population,  which  reduce  bias 
across  samples  regardless  of  its  source.  Strict  random  samples  are  commonly  used  to  compare  the 
relative  performance  of  alternative  risk  adjustment  methods.  They  provide  an  idealized  perspective 
on  the  relative  performance  of  alternative  methods.  However,  alternative  risk  adjustment  methods 
do  not  necessarily  respond  to  different  sources  of  bias  in  the  same  way.  This  was  evident  in  that 
the  correction  made  in  mean  bias  in  HMO-B  using  the  four-part  log  treatment  was  much  greater  for 
the  ADG  models  than  for  the  demographic  models.  Unless  the  independent  components  of  bias  are 
understood,  and  addressed,  it  may  be  very  difficult  to  discern  whether  the  source  of  differences  (or 
the  lack  of  differences)  in  the  performance  of  risk  adjustment  alternatives  is  a  function  of  the  risk 
adjustment method, or of other aspects of the process of risk adjustment such as the underlying distribution of costs, the size of groups, or the choice of stoploss level. Theoretically, those other aspects could include any independent component of the process of applying risk adjusted rates for payment.

There  are  minor  differences  in  the  performance  of  ADGs  in  HMO-A  and  HMO-B  that  need  to  be 
explored  first,  but  a  logical  extension  of  this  analysis  would  be  to  combine  the  data  from  these  two 
plans  to  see  the  related  effects.  That  would  add  additional  components  of  bias  that  would  be  easier 
to  identify  once  the  sources  of  bias  within  one  population  are  better  understood.  The  results  of  this 
study  suggest  that  ADGs,  and  by  extension  ACGs,  may  address  enough  of  the  bias  associated  with 
selection  within  these  populations,  without  contributing  substantial  confounding  bias,  that  they 
might  be  used  as  the  basis  for  identifying  other  sources  of  bias  when  data  from  health  plans  that  are 
as  similar  as  the  two  in  this  study  are  combined. 

Hypotheses 

The  third  set  of  hypotheses  posed  in  Chapter  3  raised  the  issue  of  the  relationship  between 
individual  and  group-level  measures.  This  analysis  shows  that  bias  associated  with  assumptions 
regarding  the  underlying  distribution  of  costs  is  not  readily  apparent  in  individual-level  measures. 
It  is  apparent  in  group-level  measures,  and  increasingly  so  as  selection  issues  are  evident.  More 
broadly, the nature and extent of bias can be obscured in individual-level results. That is why R² values were no better, or slightly lower, using a four-part log treatment, rather than actual-dollar amounts, in each study plan, even though group-level measures showed evidence of the reduction in bias (particularly in HMO-B) using the four-part treatment. The results based on actual-dollar amounts were misleading to some degree because of bias related to assumptions regarding the underlying distribution of costs. In other words, the question should not necessarily be which set of R² values is higher (and, therefore, better), but which set of expectations provides better estimates of future costs. Determining whether using a four-part log treatment is "better enough" to justify its use or, indeed, whether differences between alternative risk adjustment methods are great enough to justify their use, will depend on the extent to which "the bottom line", as measured in terms of gross (or summary) performance measures, can be understood in terms of bias (including, but not limited to, the effects of selection) underlying that bottom line, and vice versa.
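
The contrast between individual-level and group-level measures can be illustrated with simulated data. The sketch below is not drawn from the study's data. It assumes hypothetical lognormal individual costs and two sets of expectations that differ only by a uniform 4 percent understatement. All names and parameter values are illustrative.

```python
import random


def r_squared(actual, predicted):
    """Individual-level R-squared: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst


def aggregate_bias(actual, predicted):
    """Relative difference between total expected and total actual costs."""
    return sum(predicted) / sum(actual) - 1.0


rng = random.Random(1)
n = 20000
# Hypothetical risk classes; the noise term has mean 1, so pred_a is unbiased on average.
risk = [rng.choice((0.5, 1.0, 2.0, 4.0)) for _ in range(n)]
actual = [1000.0 * r * rng.lognormvariate(-0.5, 1.0) for r in risk]
pred_a = [1000.0 * r for r in risk]
pred_b = [0.96 * p for p in pred_a]  # same ranking of individuals, 4% systematic understatement

r2_a, r2_b = r_squared(actual, pred_a), r_squared(actual, pred_b)
bias_a, bias_b = aggregate_bias(actual, pred_a), aggregate_bias(actual, pred_b)
print(f"R2: {r2_a:.4f} vs {r2_b:.4f}   aggregate bias: {bias_a:+.4f} vs {bias_b:+.4f}")
```

In this setup the two R² values are nearly indistinguishable, while the aggregate bias differs by about four percentage points. A systematic bias of that size would matter for payment but would be hard to see in the individual-level statistic alone.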

The  correction  in  group-level  results  associated  with  using  the  four-part  treatment  was  more 
pronounced  for  HMO-B,  than  for  HMO-A.  Again,  that  was  because  more  correction  was  needed 
given  the  bias  across  estimation  and  validation  samples  for  HMO-B.  Bias  that  remains,  having 
used  the  four-part  treatment,  should  be  evident  in  the  pattern  of  differences  across  stoploss  levels 
and by group size (at least in the range of stoploss levels and group sizes included in this study).
That  remaining  bias  may  be  due  to  unaccounted  for  health  status  differences.  Bias  may  also  be  the 
result  of  other  differences  related  to  costs  that  are  not  directly  related  to  health  status.  This  might 
explain  why  the  final  results  for  HMO-A  are  not  precisely  "true"  across  all  the  performance 
measures for ADGs (mean forecasting bias is not fully synchronized with changes in stoploss level
and  group  size),  even  though  the  underlying  estimation  and  validation  samples  appear  to  be  more 
comparable  samples  of  the  same  population,  than  they  do  in  HMO-B.  Bias  may  also  be  inherent  in 
a  risk  adjustment  method.  Thus,  any  differences  in  the  relative  performance  of  alternative  risk 
adjustment  methods  need  to  be  considered  in  terms  of  that  inherent  bias. 

With  that  as  a  critical  subtext,  group-level  measures  do  generally  improve  with  increasingly  lower 
stoploss  levels.  The  practical  effect  of  that  improvement — as  measured,  for  example,  in  the 
number  of  groups  that  fall  within  a  given  band  of  error — is  to  minimize  the  relative  differences 
between risk adjustment methods. Within the context of making a choice between alternative risk adjustment methods, any gains in measures of performance that are attributable to stoploss level alone are made at the expense of introducing the relative bias associated with those alternatives. In other words, lower stoploss levels cover up identifiable bias in the distribution of costs.
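
The mechanical effect of a stoploss level on the dispersion of costs can be sketched as follows. This assumes, for illustration, that stoploss simply caps each individual's annual costs at the stoploss level. The study's exact treatment of costs above the threshold is not restated in this section, and the cost values and function names below are hypothetical.

```python
from statistics import mean, pstdev


def apply_stoploss(costs, level):
    """Cap each individual's costs at the stoploss level (assumed treatment)."""
    return [min(c, level) for c in costs]


def coefficient_of_variation(values):
    """Population standard deviation relative to the mean."""
    return pstdev(values) / mean(values)


# Hypothetical annual costs for eight members, with two high-cost cases.
costs = [0, 120, 450, 900, 2500, 8000, 30000, 120000]

print("no stoploss", round(coefficient_of_variation(costs), 3))
for level in (50000, 25000, 10000):
    capped = apply_stoploss(costs, level)
    print(level, round(coefficient_of_variation(capped), 3))
```

As the level falls, more of the upper tail is removed and the remaining costs become less dispersed. Group-level measures improve for every risk adjustment method as a result, regardless of how well each method explains the costs that were removed.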

The  purpose  of  risk  adjustment  in  estimating  health  service  costs  is  to  explain  health  status 
differences  across  populations  so  that  reimbursement  accounts  for  those  differences.  The  purpose 
of  reinsurance  is  to  limit  the  risk  of  otherwise  uncontrollably  high  costs  so  that  a  provider  entity  can 
operate  on  an  on-going  basis  within  acceptable  parameters  of  risk.  Ideally,  a  stoploss  level  should 
be  set  no  lower  than  the  point  at  which  costs  become  unpredictable.  In  the  presence  of  adequate 
risk  adjustment,  the  relative  performance  of  provider  entities  would  depend  solely  on  their  ability  to 
manage  the  risk  they  assume.  From  the  perspective  of  traditional  insurance  theory,  it  would  be 
inappropriate  to  use  stoploss  to  offset  available  gains  in  explanatory  power  of  alternative  risk 
adjustment  methods,  because  those  gains  would,  by  definition,  be  at  the  expense  of  information  that 
could  be  used  to  explain  differences  in  costs. 

ADGs  seemed  to  perform  perfectly  well  in  accounting  for  the  relationship  between  the  distribution 
of  costs  and  health  status  differences  in  HMO-B  in  this  study.  If  the  bias  that  remains  evident  in 
the  results  for  HMO-A  (see  the,  admittedly  modest,  failure  of  pattern  in  mean  forecasting  bias  for 
ADGs  in  HMO-A  between  groups  of  3,000  and  5,000  in  Table  5.8)  can  be  attributed  to  other  than 
health  status  differences  (in  other  words,  if  the  results  across  the  two  plans  can  be  made  even  more 
comparable  after  addressing  some  inconsistencies  in  the  HMO-A  results),  there  is  evidence  in  these 
results  to  suggest  that  ADGs  are  about  as  good  as  you  can  expect  to  get  in  trying  to  associate  health 
status  differences  with  the  distribution  of  costs  (in  comparable  populations).  Since  ADGs  are, 
essentially,  a  clustering  of  diagnoses,  this  result  should  be  the  case  within  any  health  plan  system 
where  diagnoses  are  so  clearly  tied  to  costs  as  they  are  across  the  two  health  plans  in  this  study.  The 
fact  that  there  were  no  individuals  in  ACG  15  in  the  study  sample  for  either  HMO-A  or  HMO-B 
serves  as  a  reminder  that  patterns  of  diagnosis  may  vary  across  populations  defined  in  other  ways, 
such as Medicare and Medicaid populations.

One implication of the body of results from this study is that the maximum R² may provide a
misleading  perspective  on  the  extent  to  which  risk  adjustment  alternatives  address  selection  bias. 
That  measure  indicates  the  extent  to  which  health  service  costs  are  predictable  for  a  given 
population  between  specific,  typically  few,  points  in  time.  Data  from  a  sufficiently  high  number  of 
points  in  time  should,  theoretically,  provide  a  better  measure  of  what  that  maximum  really  is. 
Unfortunately  for  that  calculation,  other  factors  such  as  changes  in  the  health  care  system  would 
confound  that  measurement.  It  may  be  difficult  for  a  health  plan  to  know  how  to  influence  selection 
effectively at any one time. At the same time, an individual-level R² may not be the most appropriate basis for comparing a risk adjustment method to that maximum because the individual measure does not reflect the population-level effects of applying any given method. Comparing an individual-level R² to a maximum R² may provide too restricted a perspective on the predictability
of  health  service  costs.  Moreover,  the  results  of  this  study  regarding  the  underlying  distribution  of 
data  suggest  that  the  issue  of  selection  may  be  more  directly  tied  to  assumptions  related  to  that 
distribution,  rather  than  specific  risk  adjustment  methods.  In  that  sense,  the  results  of  this  study 
suggest  that  too  much  importance  may  be  ascribed  to  the  choice  of  risk  adjustment  methods  when  a 
bigger  problem  forestalling  consensus  regarding  the  most  appropriate  methodology  is  the  inability 
to  isolate  the  effects  of  other  sources  of  bias  that  confound  the  assessment  of  alternative  methods. 

If  the  results  of  this  study  regarding  ADGs  are  true,  differences  in  the  relative  performance  of 
ADGs  across  otherwise  comparable  populations  (such  as  health  plans)  could  be  used  to  identify 
bias  that  is  not  related  to  health  status.  In  other  words,  ADGs  could  be  used  not  just  to  control  for 
health  differences  so  that  reimbursement  is  fair  across  comparable  groups,  they  could  also  be  used 
as  a  measure  of  the  extent  of  differences  in  other  factors.  Those  differences  would  include  issues  of 
practice  pattern  and  cost  management. 

There  is  something  circular  in  all  this.  The  ACG  case-mix  system  was  specifically  designed  to  do 
just  that,  though  not  necessarily  in  terms  of  costs  alone.  ADGs  were  categorized  into  ACGs  in 
order  to  present  them  in  a  more  intuitively  agreeable  form  that  providers  might  more  readily  accept 
(and  use).  That  categorization  was  made  with  the  understanding  that  there  would  be  an  associated 
loss  in  explanatory  power.  Once  again,  if  the  relationships  suggested  by  this  analysis  are  true,  the 
extent  of  the  trade-off  that  is  made  between  ACGs  and  ADGs  is  most  clearly  evident  in  the 
difference  in  results  for  those  methods  for  HMO-B.  In  order  to  assess  that  trade-off  using  the  data 
for  HMO-A,  some  accounting  has  to  be  made  of  the  slight  bias  evident  in  those  results  for  ADGs. 
In either case, that trade-off is small in absolute terms. Knowing the extent of the trade-off in
HMO-B  should  indicate  the  extent  of  whatever  bias  remains  unaddressed  in  HMO-A.  Identifying 
that  additional  component  of  bias  could  help  explain  differences  between  HMO-A  and  HMO-B 
that  are  not  related  to  health  status  that  is  already  accounted  for  using  those  methods.  Once  the 
effects  of  important  sources  of  bias  are  understood,  it  should  be  possible  to  combine  data  from 
across health plans and examine differences that then emerge in terms of the bias those differences introduce. The reverse of that perspective is already assumed to be true in literature on profiling
provider  practice  patterns  (Lasker  et  al.  1992,  McNeil  et  al.  1992).  The  kind  of  group-level 
analysis  used  in  this  study  may  be  useful  in  that  context  as  well. 

In  terms  of  the  relationship  between  individual  and  group-level  measures  that  engendered  this 
analysis,  relative  differences  between  risk  adjustment  methods  that  are  evident  in  individual  versus 
group-level  measures  are  related  to  the  extent  that  any  bias  inherent  in  those  methods  is  related. 
Bias  that  is  unrelated  to  any  specific  method  will  contribute  to  relative  differences  at  both  the 
individual  and  group  levels,  but  that  bias  may  not  be  evident  at  the  individual  level.  ACGs  should 
always  perform  similarly  to  ADGs,  since  the  underlying  basis  for  health  status  differences  is  the 
same  for  both  methods.  A  marked  difference  in  the  relative  performance  of  those  two  methods 
would  be  an  obvious  sign  of  unaccounted-for  bias  in  the  underlying  cost  estimates.  Since  there  is 
less  bias  inherent  in  expectations  based  on  ADGs,  examining  the  difference  in  performance  of 
ADGs  using  actual  dollars  versus  a  more  complicated  data  treatment  that  addresses  the  distribution 
of  the  underlying  data  should  provide  good  information  about  the  relative  difference  due  to 
selection  effects  between  the  population  used  to  generate  the  cost  expectations  and  the  population 
to  which  those  expectations  are  applied. 

Random  groups,  which  are  typically  used  in  analyses  comparing  risk  adjustment  methods,  tend  to 
minimize  differences  between  methods  because  they  minimize  any  form  of  bias,  whether  or  not  it  is 
attributable  to  differences  in  health  status.  In  order  to  address  this  issue,  some  researchers  have 
tried  to  estimate  the  extent  of  bias  by  selecting  groups  of  specific  high-cost  cases,  and  then 
comparing  results  for  alternative  methods  based  on  those  groups.  While  the  effects  of  bias  may  be 
greater,  and  more  evident,  in  high-cost  cases,  it  is  present  throughout  populations  to  varying 
degrees.  Isolating  high-cost  cases  may  not  provide  a  good  overall  measure  of  how  well  a  given  risk 
adjustment  method  controls  for  bias,  either  as  a  stand-alone  method  or  as  compared  to  alternative 
methods,  for  the  same  reason  that  ambulatory-only  and  inpatient  costs  are  treated  separately  in  a 
four-part  data  analysis.  In  any  event,  it  is  at  least  as  important  to  be  sensitive  to  the  sources  of  bias 
in  examining  the  effects  of  risk  adjustment  for  isolated  high-cost  cases  as  it  is  for  the  population  as 
a  whole. 

Finally,  group-level  measures  not  only  improve  with  the  more  complex  four-part  data  treatment, 
but  underlying  estimates  of  total  service  costs  will  be  unnecessarily  biased  if  it  is  not  used  when 
selection  effects  actually  occur.  This  finding  suggests  that  the  rejection  of  hypothesis  2b, 
concerning  the  relative  merit  of  data  treatments,  was  inappropriate.  It  also  emphasizes  the  point 
that  the  individual-level  R2  is  not  a  measure  of  predictive  accuracy.  The  R2  does  seem  to  be  a  good 
measure  of  the  direction  of  differences  between  risk  adjustment  methods,  at  least  between  well- 
established  alternatives.  It  also  provides  some  information  about  the  extent  of  those  differences.  In 
that sense, the R2 is useful in the development and preliminary assessment of methods. It is not
adequate, in itself, to indicate inherently population-level effects, such as the bias that confounded
the  initial  results  for  HMO-B  in  the  absence  of  the  four-part  log  treatment. 

If  bias  in  the  underlying  cost  values  is  addressed  as  well  as  can  be  accomplished  (the  four-part  log 
treatment  in  this  study),  and  any  remaining  bias  is  accounted  for — somehow — the  relative 
performance  of  alternative  risk  adjustment  methods  should  be  the  same  at  the  group  level,  as  it  is 
measured  at  the  individual  level.  The  chances  that  all  differences  between  two  populations,  or  even 
differences  in  the  same  population  in  measures  taken  over  time,  will  be  accounted  for  are  not  great. 
The  extent  of  any  differences  should  be  apparent  in  the  pattern  of  performance  of  well  established 
risk  adjustment  methods.  The  type  of  analysis  applied  in  this  study  could  be  used  to,  essentially, 
assay those differences in patterns of performance across truncation levels and group sizes. The
ACG  case-mix  system  may  be  particularly  suited  for  this  purpose  since  it  provides  two  closely 
related,  yet  distinct,  case-mix  measures  that  embody  inherently  less  confounding  bias  than 
primarily  demographic-based  methods. 

NOTE  ON  INCLUDING  A  COPAYMENT:  Risk  Differences  Do  Not  Stop 

The  results  of  this  study  suggest  that  individual-level  R2s  are  not  a  good  indication  of  how 
expectations derived from alternative risk adjustment methods will perform at the group level.
However,  they  are  a  good  indication  of  what  is  reasonable  to  expect  regarding  the  relative 
performance  of  group-level  measures.  Further,  for  reasons  related  to  fundamental  principles  of 
statistical  inference,  the  extent  to  which  group-level  measures  do  not  exhibit  the  relative 
performance  revealed  at  the  individual  level  is  a  measure  of  the  sum  of  the  bias  that  is  not 
addressed  in  the  application  of  those  methods  as  a  whole.  Once  again,  that  bias  may  be  related  to 
the  form  of  the  model  used  to  calculate  the  expectations,  the  relationship  between  specific  risk 
factors  and  the  dependent  measure  embodied  in  the  method  of  adjustment,  and  other  factors  such  as 
issues  of  cost  management. 
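
The three group-level measures discussed in this chapter can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's own computation: the relative error for each group is assumed to be (expected - actual) / actual, the sign convention is a guess, and the function name and the four example groups are hypothetical.

```python
from statistics import mean


def group_level_measures(expected, actual, tolerance=0.05):
    """Summarize forecast accuracy across groups.

    expected, actual: total expected and actual costs, one entry per group.
    Assumption: relative error = (expected - actual) / actual.
    """
    rel_err = [(e - a) / a for e, a in zip(expected, actual)]
    return {
        # Average signed error: systematic over- or under-prediction.
        "mean_forecasting_bias": mean(rel_err),
        # Average squared error: bias and dispersion combined.
        "mean_squared_forecasting_error": mean(r * r for r in rel_err),
        # Share of groups whose expected cost is within the tolerance of actual.
        "share_within_tolerance": mean(
            1.0 if abs(r) <= tolerance else 0.0 for r in rel_err
        ),
    }


# Hypothetical totals for four pseudo-groups (not data from the study).
expected = [96.0, 102.0, 90.0, 99.0]
actual = [100.0, 100.0, 100.0, 100.0]
measures = group_level_measures(expected, actual)
print(measures)
```

Under these assumptions a negative mean forecasting bias means that expectations fall short of actual costs on average, while the squared error also reflects how widely individual groups scatter around that average.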

A  seemingly  secondary  aspect  of  this  study  may  illustrate  this  point  further.  One  of  several  ways  in 
which  this  study  differs  from  other  recent  works  that  involve  the  comparison  of  alternative  risk 
adjustment  methods  is  that  10  percent  of  the  charges  above  any  given  stoploss  threshold  were 
retained  at  the  individual  level.  That  "copayment"  was  included  because  reinsurance  provided  the 
more  central  frame  of  reference  in  this  study,  and  some  level  of  copayment  would  be  reasonable  to 
expect  in  that  context.  Since  a  more  common  practice  in  the  comparison  of  risk  adjustment 
methods  is  simply  to  truncate  those  charges,  the  calculations  underlying  this  study  were  remade 
based  on  fully-truncated  charges  at  each  stoploss  level  to  get  some  sense  of  the  bias  related  to  the 
decision  to  include  a  copayment.  Appendix  C  includes  a  complete  set  of  individual  and  group-level 
measures  for  this  study  based  on  cost  values  that  included  no  copayment  amount. 
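
The two treatments of charges above a stoploss threshold can be sketched as follows. This is a minimal illustration, assuming the retained cost is the charge up to the threshold plus a fixed share of the excess; the function name `retained_cost` and the example charge are hypothetical and are not taken from the study's data.

```python
def retained_cost(charge, stoploss, copay_rate=0.10):
    """Cost retained at the individual level under a stoploss threshold.

    Charges up to the threshold are kept in full. Above the threshold only
    the copayment share is kept; copay_rate=0.0 gives simple truncation.
    """
    if charge < 0 or stoploss < 0:
        raise ValueError("charge and stoploss must be non-negative")
    excess = max(charge - stoploss, 0.0)
    return min(charge, stoploss) + copay_rate * excess


# Hypothetical individual with $60,000 in charges and a $25,000 threshold.
with_copay = retained_cost(60_000, 25_000)                 # keeps 10 percent of the excess
truncated = retained_cost(60_000, 25_000, copay_rate=0.0)  # drops the excess entirely
print(with_copay, truncated)
```

In this example the copayment version retains $28,500 and the truncated version $25,000, so two individuals with very different charges above the threshold remain distinguishable only when a copayment is applied.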

Tables 5.10, 5.11, and 5.12 show the measures for the A_G(1), ACG, and ADG(1) models that
were included in Tables 5.4, 5.7, and 5.8, respectively, presented earlier in this chapter. Those
measures are presented in the three left-hand columns of these latter tables. The three right-hand
columns of these tables reflect the same measures recalculated using no copayment. While a full
treatment of the differences between the two sets of columns in each table could be the basis for a
separate study in itself, a few quick comparisons help illustrate the more central results of this
analysis. Table 5.10 shows group-level measure results when actual dollars are used to derive the
underlying expectations. The three left-hand columns of Table 5.10 were taken from Table 5.4,
reflecting the use of a copayment. The three right-hand columns show the same measures using no
copayment. In the top half of this table, each of the measures of performance seems to improve
across all models for HMO-A, using actual dollars, when no copayment is applied. Mean
forecasting bias, in particular, goes down for all three models when no copayment is applied. The
percentage of groups within 5 percent of actual tends to go up accordingly. The reverse is generally
true for HMO-B, shown at the bottom of that table. Removing the copayment seems to accentuate
the difference in performance between the two plans. One interpretation is that more, rather than
less, bias has been introduced by removing that copayment.

Table 5.11 reflects a similar comparison, with and with no copayment, but using the four-part log
treatment instead of actual dollars. In this table, the leftmost columns are measures that were
reported earlier in Table 5.7. Performance measures for the ACG and ADG(1) models get worse
for HMO-A at lower stoploss levels using the more complex data treatment and no copayment.
They improve at all stoploss levels for the A_G(1) model, but in such a way as to suggest that the
simple demographic model performs better than both the ADG and ACG models at lower stoploss
levels.  That  result  is  indefensible  given  the  prior  relationship  across  risk  adjustment  methods 
indicated  by  the  individual-level  R2  results.  Results  for  all  the  models  reflect  less  bias  when  a 
copayment  is  applied.  For  HMO-B,  mean  forecasting  bias  tends  to  get  worse  when  no  copayment 
is  applied  but  the  percentage  of  groups  within  5  percent  of  actual  stays  the  same  or  improves.  This, 
again,  reflects  the  introduction  of  bias  with  the  removal  of  the  copayment. 

Table  5.12  shows  that  the  pattern  evident  in  the  results  on  mean  forecasting  bias  by  group  size  still 
"rings  true"  in  the  data  for  HMO-B  using  no  copayment,  though  slightly  less  so  than  when  a 
copayment  is  used.  Mean  forecasting  bias  goes  up  as  group  size  goes  down  without  a  copayment. 
It  might  be  possible  to  get  to  0.0  percent  in  mean  forecasting  bias  when  no  copayment  is  used  with 
the  ADG  model  given  larger  groups.  At  the  same  time,  the  comparable  results  might  actually  get 
worse,  or  less  stable,  for  ACGs  with  larger  groups,  since  mean  forecasting  bias  shown  on  the  right- 
hand  side  of  this  table  goes  up  with  the  largest  group  size  (5,000)  when  it  should,  theoretically,  go 
down.

In any case, simply truncating high cost values at the levels used in this study seems to contribute to
the sum of the bias in the application of risk adjustment methods as a whole. Retaining some
information about high-cost cases produces better estimates of future costs. One way to think
about this is that the goal in setting a threshold of loss is to minimize the undue, or exaggerated,
influence  of  high-cost  individual  cases.  By  simply  truncating  high  costs  associated  with  those 
cases,  some  useful  information  about  the  distribution  of  related  costs  is  lost.  Retaining  some  of 
that  information  on  a  percentage  basis  in  a  copayment  essentially  preserves  the  character  of  the 
distribution,  even  though  the  full  impact  of  the  related  high  costs  is  reduced.  The  intuitive  basis  for 
this  suggestion  is  the  same  as  the  basis  for  suggesting  that  a  copayment  will  ensure  that  providers 
will  retain  the  incentive  to  provide  services  efficiently  above  a  stoploss  threshold.  The  kinds  of 
cases  that  exceed  some  point  of  acceptable  risk,  such  as  a  stoploss  threshold,  affect  the  distribution 
of  risk  throughout  the  full  body,  or  population,  of  related  risks.  Part  of  the  evidence  for  this  is  that 
the  correction  for  bias  in  the  distribution  of  costs  makes  a  noticeable  difference  even  at  the  lowest 
truncation  level  in  this  study.  These  results  reinforce  the  more  central  theme  of  the  broader  body  of 
results  from  this  study  that  it  is  better  to  get  a  measure  of  the  sum  of  the  bias  that  affects  the 
process  of  risk  adjustment  (to  look  at  the  whole)  than  to  focus  on  seemingly  independent  individual 
risks.

Table 5.10: Group-Level Measures by Stoploss Level, using untransformed dollars,
with and with no copayment

                   With Copayment              With No Copayment
Stoploss       A_G(1)      ACG   ADG(1)      A_G(1)      ACG   ADG(1)

(a) Mean Forecasting Bias, HMO-A
50k             -8.5%    -5.5%    -5.6%       -7.9%    -5.0%    -5.0%
25k             -6.0%    -2.9%    -3.1%       -5.1%    -1.9%    -2.1%
10k             -4.2%    -1.2%    -1.5%       -2.9%     0.2%    -0.1%
5k              -4.5%    -1.7%    -1.9%       -2.9%    -0.2%    -0.3%

(b) Mean Squared Forecasting Error, HMO-A
50k              0.9%     0.5%     0.5%        0.8%     0.5%     0.5%
25k              0.5%     0.3%     0.3%        0.4%     0.2%     0.2%
10k              0.3%     0.2%     0.2%        0.2%     0.1%     0.1%
5k               0.3%     0.1%     0.1%        0.2%     0.1%     0.1%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
50k             23.3%    35.0%    43.3%       28.3%    40.0%    46.7%
25k             41.7%    61.7%    65.0%       50.0%    68.3%    65.0%
10k             53.3%    78.3%    80.0%       70.0%    81.7%    80.0%
5k              53.3%    81.7%    85.0%       75.0%    88.3%    90.0%

(d) Mean Forecasting Bias, HMO-B
50k             -6.4%    -2.1%    -6.8%       -6.6%    -2.3%    -6.9%
25k             -8.5%    -4.6%    -8.3%       -8.9%    -5.2%    -8.6%
10k             -8.4%    -5.3%    -7.8%       -8.8%    -6.0%    -8.1%
5k              -7.3%    -4.6%    -6.7%       -7.6%    -5.3%    -6.8%

(e) Mean Squared Forecasting Error, HMO-B
50k              0.6%     0.3%     0.7%        0.6%     0.3%     0.7%
25k              0.9%     0.4%     0.9%        1.0%     0.5%     0.9%
10k              0.8%     0.4%     0.7%        0.9%     0.5%     0.8%
5k               0.6%     0.3%     0.5%        0.7%     0.4%     0.5%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
50k             30.0%    58.3%    26.7%       28.3%    56.7%    26.7%
25k             21.7%    45.0%    13.3%       20.0%    45.0%    13.3%
10k             16.7%    41.7%    16.7%       15.0%    33.3%    15.0%
5k              26.7%    58.3%    30.0%       21.7%    48.3%    25.0%

Table 5.11: Group-Level Measures by Stoploss Level, 4-part log treatment, groups of 3,000,
with and with no copayment

                   With Copayment              With No Copayment
Stoploss       A_G(1)      ACG   ADG(1)      A_G(1)      ACG   ADG(1)

(a) Mean Forecasting Bias, HMO-A
50k             -6.7%    -5.3%    -3.8%       -6.1%    -4.7%    -3.3%
25k             -4.0%    -2.3%    -1.1%       -3.0%    -1.3%    -0.2%
10k             -1.9%    -0.1%     0.6%       -0.6%     1.3%     1.9%
5k              -2.4%    -0.5%     0.1%       -0.9%     1.1%     1.5%

(b) Mean Squared Forecasting Error, HMO-A
50k              0.7%     0.5%     0.4%        0.6%     0.4%     0.3%
25k              0.3%     0.3%     0.2%        0.3%     0.2%     0.2%
10k              0.2%     0.2%     0.1%        0.1%     0.2%     0.2%
5k               0.2%     0.1%     0.1%        0.1%     0.1%     0.1%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
50k             35.0%    35.0%    50.0%       38.3%    41.7%    55.0%
25k             55.0%    65.0%    80.0%       65.0%    75.0%    76.7%
10k             78.3%    78.3%    78.3%       78.3%    75.0%    71.7%
5k              78.3%    81.7%    83.3%       86.7%    80.0%    78.3%

(d) Mean Forecasting Bias, HMO-B
50k             -2.4%     3.5%     0.7%       -2.5%     3.4%     0.7%
25k             -4.4%     1.2%    -0.5%       -4.8%     0.7%    -0.8%
10k             -4.4%     0.5%    -0.3%       -5.0%    -0.3%    -0.7%
5k              -3.8%     0.6%    -0.2%       -4.4%    -0.4%    -0.5%

(e) Mean Squared Forecasting Error, HMO-B
50k              0.3%     0.4%     0.3%        0.3%     0.4%     0.3%
25k              0.4%     0.2%     0.2%        0.4%     0.2%     0.2%
10k              0.4%     0.1%     0.1%        0.4%     0.1%     0.1%
5k               0.3%     0.1%     0.1%        0.3%     0.1%     0.1%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
50k             65.0%    56.7%    68.3%       65.0%    58.3%    68.3%
25k             43.3%    78.3%    73.3%       41.7%    80.0%    71.7%
10k             51.7%    81.7%    80.0%       46.7%    83.3%    80.0%
5k              60.0%    93.3%    86.7%       53.3%    91.7%    88.3%

Table 5.12: Group-Level Measures by Group Size, 4-part log treatment, stoploss at $25,000,
with and with no copayment

                   With Copayment              With No Copayment
Size           A_G(1)      ACG   ADG(1)      A_G(1)      ACG   ADG(1)

(a) Mean Forecasting Bias, HMO-A
5000            -3.2%    -1.2%    -0.1%       -2.3%    -0.2%     0.8%
3000            -4.0%    -2.3%    -1.1%       -3.0%    -1.3%    -0.2%
1500            -3.2%    -0.8%     0.5%       -2.3%     0.2%     1.4%
500             -3.7%    -1.0%     0.0%       -3.0%    -0.2%     0.8%

(b) Mean Squared Forecasting Error, HMO-A
5000             0.2%     0.1%     0.1%        0.2%     0.1%     0.1%
3000             0.3%     0.3%     0.2%        0.3%     0.2%     0.2%
1500             0.6%     0.7%     0.5%        0.5%     0.6%     0.5%
500              1.6%     1.5%     1.3%        1.4%     1.3%     1.2%

(c) Percent of Groups Within 5 Percent of Actual, HMO-A
5000            68.3%    75.0%    85.0%       81.7%    81.7%    88.3%
3000            55.0%    65.0%    80.0%       65.0%    75.0%    76.7%
1500            46.7%    51.7%    58.3%       45.0%    53.3%    58.3%
500             18.3%    38.3%    38.3%       25.0%    40.0%    41.7%

(d) Mean Forecasting Bias, HMO-B
5000            -4.0%     1.9%    -0.3%       -4.4%     1.4%    -0.5%
3000            -4.4%     1.2%    -0.5%       -4.8%     0.7%    -0.8%
1500            -4.1%     1.2%    -1.0%       -4.4%     0.7%    -1.1%
500             -4.2%     0.8%    -2.0%       -4.5%     0.4%    -2.2%

(e) Mean Squared Forecasting Error, HMO-B
5000             0.2%     0.1%     0.1%        0.3%     0.1%     0.1%
3000             0.4%     0.2%     0.2%        0.4%     0.2%     0.2%
1500             0.7%     0.6%     0.5%        0.7%     0.5%     0.5%
500              1.4%     1.7%     1.5%        1.4%     1.6%     1.4%

(f) Percent of Groups Within 5 Percent of Actual, HMO-B
5000            65.0%    83.3%    96.7%       60.0%    91.7%    95.0%
3000            43.3%    78.3%    73.3%       41.7%    80.0%    71.7%
1500            41.7%    41.7%    45.0%       41.7%    40.0%    43.3%
500             33.3%    28.3%    33.3%       33.3%    28.3%    33.3%

CHAPTER 6
SUMMARY  AND  CONCLUSION 

This  study  began  with  the  intention  of  examining  the  impact  of  reinsurance  on  the  process  of 
applying  risk  adjusted  health  service  payments,  or  capitation  rates.  Particular  attention  was  paid  to 
statistical  effects  because  reinsurance  effectively  moderates  the  variation  in  underlying  costs  that 
are  used  to  estimate  future  expenditures  in  such  a  process.  In  the  absence  of  well  established 
models  to  achieve  that  examination,  a  significant  feature  of  the  subsequent  analysis  was  the 
articulation  of  a  methodology  that  can  be  used  to  examine  the  effects  of  any  independent 
component  of  the  broader  process  of  applying  risk  adjusted  payment  rates.  This  method  involves 
building  a  "platform"  for  performance  measures  of  alternative  risk  adjustment  methods,  and 
examining  patterns  in  those  measures  in  light  of  relatively  simple  principles  of  statistical  inference 
regarding  variance,  sample  size,  and  the  estimation  of  means. 

The  essential  premise  of  this  analysis  is  that,  since  expectations  are  sample  mean  values  from  some 
larger  population  of  possible  values,  there  should  be  some  identifiable  relationship  between 
individual  and  group-level  results,  and  that  relationship  should  involve  the  size  of  the  groups. 
Moreover,  the  relative  performance  of  alternative  risk  adjustment  methods  should  be  roughly  the 
same  across  comparable  populations.  Relative  patterns  of  performance  were  first  established  on 
individual-level  measures  across  alternative  risk  adjustment  methods.  Patterns  of  group-level 
measures  of  predictive  accuracy — particularly  across  truncation  levels  (associated  with  stoploss  in 
this  study)  and  group  size — were  then  assessed  as  indicators  of  the  level  of  untreated  (or  remaining) 
bias  in  the  application  of  the  risk  adjusted  expectations.  Bias  was  evident  in  the  extent  to  which 
those  patterns  did  not  conform  to  the  underlying  statistical  principles  and  assumptions.  This  was 
demonstrated  in  the  before  and  after  effects  of  applying  a  four-part  log  transformation  to 
underlying  cost  values,  where  overall  mean  bias  was  reduced  to  close  to  zero  in  both  plans.  This 
was  also  demonstrated  in  the  before  and  after  effects  of  simply  truncating  extreme  cost  values  as 
opposed  to  minimizing  the  influence  of  extreme  cases  in  the  process  of  modeling  coinsurance. 
Simply  truncating  extreme  values  introduced,  rather  than  reduced,  bias  in  underlying  expectations. 
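
The premise that group-level error should depend on group size can be written compactly. The following is a textbook decomposition offered as a sketch, not a derivation taken from this study. The symbols are assumptions introduced here: e_g is the relative forecast error for a group g of n members, b is the systematic bias, and sigma^2 is the variance of an individual's forecast error expressed as a share of mean cost, with individuals treated as independent.

```latex
\[
\mathrm{E}\left[e_g^{2}\right] = b^{2} + \mathrm{Var}(e_g),
\qquad
\mathrm{Var}(e_g) \approx \frac{\sigma^{2}}{n}
\]
```

Under these assumptions the variance term shrinks as group size grows while the bias term does not, so forecast error that persists in large groups points to bias rather than to sampling variation.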

While  the  platform  described  in  this  report  was  primarily  used  to  analyze  group-level  performance 
measures  of  risk  adjusted  expectations  of  costs,  these  results  do  not  suggest  that  analysis  of  group- 
level  measures  should  replace  the  examination  of  individual-level  measures.  On  the  contrary,  the 
group-level  analysis  used  in  this  study  derives  from  the  assumption  that  individual-level  measures 
of  model  performance,  such  as  the  R2,  should,  in  fact,  reveal  the  "true"  relationship  between  risk 
adjustment  alternatives.  Group-level  analysis  provides  an  added  dimension  to  the  examination  of 
the  application  of  risk  adjustment  as  a  whole,  whereas  individual-level  measures  are  a  more  direct 
reflection  of  the  development  and  calculation  of  the  underlying  expectations.  Thus,  those  two 
levels of measurement represent different phases of the process of risk adjustment as a whole.

STUDY RESULTS

Three  sets  of  hypotheses  were  proposed  for  this  study.  Results  based  on  the  first  set  of  hypotheses 
show  that  the  relative  performance  of  four  alternative  risk  adjustment  methods  is  consistent  across 
individual-level  measures  of  that  performance,  and  across  comparable  health  plans.  This  is 
important  because  it  suggests  that  those  methods  are  stable,  and  that  discrepancies  (bias)  evident  in 
relative  patterns  of  performance  should  be  limited,  by  virtue  of  the  principles  noted  above,  to 
differences  that  are  consistent  with  that  relative  performance.  In  other  words,  for  example,  ADGs 
should  always  perform  better  than  simple  demographic  criteria  when  they  are  used  to  establish  risk 
adjusted  expectations  for  populations  comparable  to  those  examined  in  this  study.  Moreover, 
significant  bias  between  expected  and  actual  outcomes  that  is  not  consistent  with  established 
relative  performance  can  be  ascribed  to  sources  other  than  the  risk  adjustment  methods  themselves. 

Once  the  relative  pattern  of  performance  of  alternative  risk  adjustment  methods  was  established,  a 
second  set  of  hypotheses  was  used  to  examine  the  effects  of  stoploss  reinsurance  in  the  context  of 
other  data  treatments  that  are  typically  employed  in  setting  expectations.  That  analysis  showed  that 
more  sophisticated  data  treatments  based  on  log  transformation  and  multipart  modeling  did  not 
significantly  improve  individual-level  measures  of  performance.  Taken  at  face  value,  those  results 
might  be  seen  as  supportive  of  a  current  trend  not  to  apply,  for  example,  log  transformation  in 
comparative  analyses  of  risk  adjustment  methods.  However,  subsequent  analysis  given  a  third  set 
of  hypotheses  defined  in  terms  of  group-level  measures  showed  that  results  based  solely  on 
individual-level  measures  of  model  performance  may  provide  misleading  indications  of  both  the 
necessity  for  specific  data  treatments  and  the  relative  performance  of  alternative  risk  adjustment 
methods. 

Data  Transformation 

Data  from  two  otherwise  comparable  health  plans  were  shown  to  exhibit  comparable  results  on  a 
selection  of  individual-level  measures  of  model  performance,  given  four  alternative  risk  adjustment 
methods.  When  expectations  derived  from  the  individual-level  analyses  were  applied  in  group- 
level  analyses  in  each  respective  plan,  bias  in  the  underlying  expectations  was  made  evident.  More 
precisely,  when  expectations  were  derived  using  dependent  costs  defined  in  actual  dollars,  there 
was  an  overall  discrepancy  between  expected  and  actual  costs  (prediction  error  was  not  centered 
close to zero in sufficiently large groups), and the relative relationship between risk adjustment
alternatives  was  not  necessarily  consistent  between  individual  and  group-level  measures.  This 
result  was  most  apparent  in  the  health  plan  where  selection  differences  were  most  evident  between 
the  underlying  estimation  and  validation  samples.  The  application  of  a  four-part  log  treatment 
removed  most  of  the  bias  that  could  be  attributed  to  the  underlying  distribution,  improved  the 
absolute  results  of  all  group-level  performance  measures,  and  made  the  relative  performance  of 
alternative  risk  adjustment  methods  more  comparable  across  the  study  sites,  regardless  of 
underlying bias due to selection effects.

The results of this study illustrate, and reaffirm, the earlier work of Duan et al. (1982) regarding the
treatment  of  health  service  cost  data.  An  important  implication  of  these  results  is  that  a  failure  to 
address  problems  inherent  in  the  underlying  distribution  of  the  dependent  cost  data  may  confound 
the  comparison  of  risk  adjustment  alternatives  and,  more  importantly,  produce  biased  expectations 
of  costs.  The  practical  significance  of  these  results  is  a  function  of  the  extent  to  which  selection 
differences  are  a  concern  in  the  distribution  of  health  service  costs. 
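
The structure of a multipart log treatment is not spelled out in this chapter, so the following is only a rough sketch under stated assumptions. It assumes four parts (the probability of any use, the probability of inpatient use given any use, and separate log-scale cost models for ambulatory-only and inpatient users) and a smearing-type retransformation factor for returning log-scale predictions to dollars. All function names, parameters, and numbers are hypothetical, and the study's actual procedure may differ.

```python
import math


def smearing_factor(log_residuals):
    """Mean of exp(residual): a nonparametric retransformation factor."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def expected_cost(p_any_use, p_inpatient_given_use,
                  mean_log_ambulatory, smear_ambulatory,
                  mean_log_inpatient, smear_inpatient):
    """Combine the four assumed parts into one expected cost in dollars."""
    # Conditional mean costs, retransformed from the log scale.
    ambulatory = math.exp(mean_log_ambulatory) * smear_ambulatory
    inpatient = math.exp(mean_log_inpatient) * smear_inpatient
    # Weight the two user types, then scale by the chance of any use.
    among_users = ((1.0 - p_inpatient_given_use) * ambulatory
                   + p_inpatient_given_use * inpatient)
    return p_any_use * among_users


# Hypothetical inputs for one risk category (not estimates from the study).
example = expected_cost(
    p_any_use=0.8,
    p_inpatient_given_use=0.1,
    mean_log_ambulatory=math.log(500), smear_ambulatory=1.2,
    mean_log_inpatient=math.log(8000), smear_inpatient=1.5,
)
print(round(example, 2))
```

The retransformation step matters because exponentiating a mean of logged costs understates the mean cost in dollars. Omitting a correction of this kind is one way in which a log treatment can itself introduce bias.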

The  Effects  of  Reinsurance 

Group-level  analyses  also  provided  an  added  dimension  to  the  examination  of  the  effects  of 
reinsurance.  The  results  show  that  individual-level  measures  of  performance  improve  as  the 
underlying  variation  in  costs  is  reduced  at  each  successively  lower  stoploss  level,  and  that 
differences  between  risk  adjustment  alternatives  on  those  measures  widen  at  lower  levels.  In  other 
words,  increasingly  lower  truncation  levels  make  differences  between  risk  adjustment  methods 
more distinct. At the same time, there are no significant differences between alternative methods,
of  any  group  size,  in  the  percentage  of  groups  that  fall  within  5  percent  of  actual  costs  at  stoploss 
levels  of  $10,000  or  less  in  this  study.  This  is  because  the  reduction  in  underlying  variation  is  so 
dramatic  at  low  stoploss  levels.  Thus,  lower  stoploss  levels  can  remove  the  practical  effect  of 
applying  risk  adjustment  in  setting  payment  rates. 
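
The reduction in underlying variation at successively lower stoploss levels can be illustrated with a small sketch. The charges below are hypothetical and chosen only to include a few high-cost cases; the retained-cost rule assumes the 10 percent copayment described in this study, and the helper name is an assumption introduced here.

```python
from statistics import pstdev


def retained_cost(charge, stoploss, copay_rate=0.10):
    """Charge up to the threshold plus the copayment share of the excess."""
    excess = max(charge - stoploss, 0.0)
    return min(charge, stoploss) + copay_rate * excess


# Hypothetical annual charges for seven individuals (illustrative only).
charges = [200, 800, 1_500, 4_000, 12_000, 60_000, 150_000]

spread_by_level = {}
for stoploss in (50_000, 25_000, 10_000, 5_000):
    costs = [retained_cost(c, stoploss) for c in charges]
    # Population standard deviation of the retained costs at this level.
    spread_by_level[stoploss] = pstdev(costs)

for stoploss, spread in spread_by_level.items():
    print(f"stoploss {stoploss:>6}: standard deviation {spread:,.0f}")
```

For these illustrative charges the standard deviation of retained costs falls at each lower threshold, which leaves less variation for any risk adjustment method to explain.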

Traditional  insurance  theory  suggests  that  stoploss  is  intended  to  deal  with  outliers  that  are,  in  any 
other  sense,  unpredictable.  By  contrast,  risk  adjustment  is  used  to  account  for  identifiable  (and, 
therefore,  predictable)  differences  throughout  a  population.  Stoploss  reinsurance  can  be  used  to 
complement the process of risk adjustment by moderating the extreme distribution of costs. That
may  make  the  costs  that  are  subject  to  risk  adjustment  more  predictable.  However,  from  a 
traditional  perspective  regarding  insurance,  it  would  be  inappropriate  to  use  stoploss  to  offset 
available  gains  in  the  explanatory  power  of  alternative  risk  adjustment  methods  because  those  gains 
would  be,  by  definition,  made  at  the  expense  of  information  that  could  be  used  to  explain 
differences  in  costs. 

Other  Results 

Group-level  analysis  was  also  used  to  shed  light  on  the  choice  of  treatments  of  age  when  it  is 
included  as  an  independent  variable  in  setting  payment  rates.  While  individual-level  measures  of 
performance  indicated  that  a  continuous  treatment  of  age  consistently  outperforms  a  categorical 
treatment,  group-level  measures  indicated  that  there  were  only  limited  significant  differences 
between  those  treatments.  This  suggests  that  the  small  amount  of  added  variation  explained  by 
models  based  on  a  continuous  treatment  of  age  (including  the  square  of  age  and  the  age/gender 
interactive  terms  used  in  this  study)  may  not  outweigh  the  practical  value  of  using  age  categories 
that are more consistent with existing actuarial practice in setting payment rates.

Finally, a coinsurance rate of 10 percent was used in calculating underlying expectations in this
study  because  it  was  consistent  with  the  broader  context  of  reinsurance.  Since  many  existing 
comparative  analyses  do  not  account  for  coinsurance,  separate  analyses  were  also  conducted  using 
simple  truncation  at  each  stoploss  level.  Group-level  measures  revealed  that  a  noticeable  level  of 
additional  bias  was  introduced  when  simple  truncation  was  applied  to  the  underlying  data. 
Including  some  information  about  extreme  cost  values  on  a  percentage  basis  in  the  calculation  of 
underlying  cost  expectations  seems  to  preserve  the  character  of  the  distribution  of  the  underlying 
data  and,  therefore,  produce  better  expectations  of  future  expenditures.  When  simple  truncation  is 
used  in  the  comparison  of  alternative  risk  adjustment  methods,  that  comparison  may  be  subject  to 
confounding  bias  related  to  the  treatment  of  the  data  (and  not  the  risk  adjustment  methods 
themselves). 

FURTHER  IMPLICATIONS 

Most  recent  research  in  the  area  of  risk  adjustment  has  focused  on  comparing  the  relative  merits  of 
alternative methods, and on refining existing methods. The type of analysis used in this study may
help  clarify  that  comparison  and,  by  extension,  help  identify  the  scale  of  advances  that  are  made 
with  the  introduction  of  new  methods.  Current  research  in  the  area  of  risk  adjustment  also  tends  to 
focus  on  the  comparison  of  alternative  methods  within  given  populations,  such  as  a  health  plan. 
One  implication  of  the  findings  in  this  study  is  that  it  may  be  useful  to  examine  the  relative  effects 
of  alternative  risk  adjustment  methods  across  health  plans.  Such  comparisons  could  improve  the 
process  of  risk  adjustment  as  a  whole  by  providing  a  means  to  calibrate  the  performance  of 
methods  in  different  settings,  and  serve  as  a  basis  for  examining  the  extent  of  differences  that  exist 
across  populations.  The  type  of  group-level  analysis  used  here  could  be  used  as  a  crude  platform  to 
decompose  the  sum  of  bias  related  to  independent  aspects  of  that  process.  Once  the  effects  of  very 
broadly  influential  factors,  such  as  the  treatment  of  the  underlying  data  or  the  limitation  of  risk 
through  reinsurance,  are  better  understood,  it  may  be  possible  to  discern  more  subtle  effects.  Those 
effects  may,  for  example,  be  attributable  to  systemic  factors,  such  as  differences  in  data  systems  or 
variations  in  patterns  of  diagnosis  related  to  how  coverage  is  defined  (benefit  packages). 

One  way  in  which  this  study's  results  might  be  used  is  in  the  context  of  retrospective  adjustment  of 
prospective  payment  rates.  Newhouse  (1986,  1994)  has  suggested  that  health  service  payments 
may  need  to  be  based  on  a  blend  of  risk  adjusted  prospective  payment  and  the  reconciliation  of 
actual  costs  during  a  rating  period — a  method  referred  to  as  partial  capitation.  One  example  might 
be  that  providers  would  receive  40  percent  of  a  risk  adjusted  capitation  rate,  and  then  get 
reimbursed  for  60  percent  of  subsequent  actual  costs.  That  reconciliation  might  obscure  any 
differences  between  alternative  risk  adjustment  methods  in  the  same  way  that  a  very  low  stoploss 
threshold  limits  the  practical  differences  between  methods.  Aside  from  the  administrative  difficulty 
of  accounting  for  each  provider's  actual  costs,  this  type  of  reimbursement  scheme  might  undermine 
the  incentives  for  efficiency  and  fairness  that  risk  adjustment  is  intended  to  address. 
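
The blending arithmetic in the partial capitation example can be sketched in a few lines. This is a minimal illustration, not a method used in the study: the function name, the two hypothetical capitation rates, and the actual cost figure are assumptions introduced here; only the 40/60 split comes from the text.

```python
def partial_capitation_payment(capitation_rate, actual_cost, prospective_share=0.40):
    """Blend a risk adjusted capitation rate with reimbursement of actual cost.

    prospective_share is the fraction paid prospectively (0.40 in the text's
    example); the remainder is reconciled against actual cost.
    """
    if not 0.0 <= prospective_share <= 1.0:
        raise ValueError("prospective_share must be between 0 and 1")
    retrospective_share = 1.0 - prospective_share
    return prospective_share * capitation_rate + retrospective_share * actual_cost


# Hypothetical per-member-per-month rates from two risk adjustment methods
# for the same group, and a hypothetical actual cost (all assumed values).
rate_method_a = 100.0
rate_method_b = 120.0
actual_cost = 150.0

payment_a = partial_capitation_payment(rate_method_a, actual_cost)
payment_b = partial_capitation_payment(rate_method_b, actual_cost)

gap_between_rates = rate_method_b - rate_method_a
gap_between_payments = payment_b - payment_a

print(f"Gap between rates:    {gap_between_rates:.2f}")
print(f"Gap between payments: {gap_between_payments:.2f}")
```

Under these assumed figures, a $20 difference between the two rates shrinks to $8 in the blended payment, which is the sense in which reconciliation could mask differences between methods.
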

Alternatively, the platform described in this study could be used to assess the application of
capitation  rates  on  the  provider  community  as  a  whole,  based  on  a  sample  of  data  drawn  from 
participating  providers.  A  retrospective  adjustment  could  be  made  on  the  basis  of  an  accounting 
for  the  influence  of  significant  components  of  the  broader  process  of  applying  those  rates.  For 
example,  both  plans  in  this  study  exhibited  close  to  a  4  percent  bias  given  stop  loss  at  $25,000, 
regardless  of  group  size,  when  age  and  gender  were  used  alone  to  establish  payment  rates.  This 
approach  would  establish  a  population-level  basis  for  any  retrospective  adjustment  that  is  evidently 
appropriate,  rather  than  build  on  actual  costs  that  have  historically  been  the  basis  of  fee-for-service 
care  with  its  attendant  lack  of  incentive  for  the  efficient  management  of  the  provision  of  health 
services. 
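
A population-level retrospective adjustment of this kind might look like the following sketch. It assumes, for illustration only, that bias is measured as the relative difference between total expected and total actual costs in a sample, and that rates are rescaled by that factor; the function names and the sample figures are hypothetical and are not taken from the study.

```python
def percent_bias(expected, actual):
    """Relative difference between total expected and total actual costs."""
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("total actual cost must be nonzero")
    return (sum(expected) - total_actual) / total_actual


def retrospectively_adjust(rates, bias):
    """Rescale prospective rates to remove a known population-level bias."""
    return [rate / (1.0 + bias) for rate in rates]


# Hypothetical sample of groups: expected rates run 4 percent above actual,
# mirroring the size of the bias quoted in the text for age and gender alone.
expected_rates = [104.0, 52.0, 156.0]
actual_costs = [100.0, 50.0, 150.0]

bias = percent_bias(expected_rates, actual_costs)
adjusted_rates = retrospectively_adjust(expected_rates, bias)

print(f"Estimated bias: {bias:.1%}")
print("Adjusted rates:", [round(rate, 2) for rate in adjusted_rates])
```

The adjustment uses only the sample-level bias, not any individual provider's actual costs, which is the distinction the paragraph above draws.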

Ultimately,  it  may  be  possible  to  isolate  the  effects  of  differences  in  provider  practice  and  cost 
management.  Since  ADGs  reflect  the  distribution  of  diagnoses  in  a  population,  the  fact  that  they 
performed  so  consistently  across  two  independent  populations  suggests  that  the  pattern  of  making 
diagnoses  is  similar  in  the  two  plans.  Although  it  raises  the  issue  of  comparability  of  benefit 
packages,  the  consistency  in  patterns  of  diagnosis  is  also  evident  in  the  fact  that  neither  plan  had 
individuals in ACG 15 in this study. One long-range implication of this finding is that a
better understanding of how patterns of diagnoses (perhaps captured in terms of ADGs) vary
across health plans may eventually make it possible to decompose the bias associated with practice
patterns that is evident in group-level measures.

The  results  of  this  study  indicate  that  ADGs  have  the  least  inherently  biased  relationship  to  the 
distribution  of  health  service  costs  of  the  methods  examined.  It  is  important  to  remember,  however, 
that the results of this study are most relevant to a largely employed population that includes
dependents.  Moreover,  if  diagnoses  were  the  basis  of  risk  adjustment  used  to  set  expectations  of 
costs,  there  is  some  potential  that  patterns  of  diagnoses  may  change  as  a  result.  Still,  the  results 
based  on  ADGs  in  this  study  suggest  that  using  the  information  embodied  in  diagnoses  is  the  key  to 
providing  otherwise  unbiased  adjustment  for  health  status  differences  related  to  costs  across 
populations. 

LIMITATIONS 

Clearly,  the  application  of  the  type  of  group-level  analysis  used  in  this  study  will  require 
considerable  refinement  before  it  can  be  used  for  more  sophisticated  purposes.  Each  of  the  primary 
factors  examined  as  a  source  (or  indicator)  of  bias  here — data  treatment  method,  truncation  level, 
group  size,  risk  adjustment  method,  and  specific  performance  measures — could  be  examined 
further  to  learn  more  about  the  character  of  their  effects.  For  example,  bias  is  evidently  introduced 
in  the  data  for  HMO-B  when  the  stoploss  level  is  increased  to  $50,000.  Is  that  bias  related  to  the 
level  of  copayment  used  in  the  study?  If  so,  what,  if  any,  is  the  relationship  between  truncation 
level  and  the  percentage  of  extreme  values  that  is  retained  in  the  data?  In  terms  of  performance 
measures, mean forecasting bias may reveal differences across high and low-use groups (defined
either  in  terms  of  actual  or  expected  values)  that  indicate  something  about  the  nature  of  untreated 
bias.  There  is  the  hint  of  evidence  in  the  results  for  this  study  that  ACGs  may  control  for  bias 
associated  with  the  underlying  distribution  of  costs  in  some  way  that  is  different  than  the  other 
methods  included  in  this  analysis. 
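
The question about truncation level and the share of extreme values retained can be made concrete with a small sketch. It assumes that stoploss is applied as a simple cap on each member's cost at the threshold, with no copayment above it; the function names and the cost figures are hypothetical and are not data from either plan.

```python
def apply_stoploss(costs, threshold):
    """Cap each member's cost at the stoploss threshold."""
    return [min(cost, threshold) for cost in costs]


def truncation_summary(costs, threshold):
    """Summarize how much of the cost distribution survives truncation."""
    truncated = apply_stoploss(costs, threshold)
    members_over = sum(1 for cost in costs if cost > threshold)
    return {
        "share_members_over": members_over / len(costs),
        "share_dollars_retained": sum(truncated) / sum(costs),
    }


# Hypothetical annual costs for five members, two of them extreme.
annual_costs = [500.0, 1200.0, 3000.0, 30000.0, 80000.0]

for level in (25000.0, 50000.0):
    summary = truncation_summary(annual_costs, level)
    print(
        f"Stoploss ${level:,.0f}: "
        f"{summary['share_members_over']:.0%} of members exceed it, "
        f"{summary['share_dollars_retained']:.1%} of dollars retained"
    )
```

Raising the threshold from $25,000 to $50,000 in this toy sample retains more of the extreme dollars, which is the kind of relationship the questions above would need to characterize with real data.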

In  general,  the  relationship  between  the  underlying  distribution  of  costs  and  the  independent 
components  of  the  broader  process  of  applying  risk  adjusted  health  service  payments  was  examined 
in  terms  of  bias  in  this  analysis.  It  is  not  clear  that  it  is  even  possible  to  control  the  distribution  of 
costs  well  enough  to  isolate  very  subtle  effects.  That  is,  in  part,  because  it  is  not  clear  that  other 
important  potential  sources  of  bias  can  be  controlled  sufficiently  to  apply  this  type  of  analysis 
across  key  components  of  a  broadly  defined  health  care  system.  Factors  such  as  variation  in  the 
definition  of  benefit  packages,  data  systems,  and  access  to  information  confound  that  broader 
assessment. 

There is a critical underlying assumption implicit throughout this analysis, which is that patterns of
diagnoses are more stable than variation in these other factors. That is to suggest that, if everything
else is held constant, most physicians will define (though not necessarily treat) most illness in the
same way. That is not to suggest that the opinions of physicians do not vary, but that there are
evolving standards of practice that stabilize that variation in more effective ways than other factors
stabilize the distribution of health service costs. The deeper significance of patterns of diagnosis
is beyond the scope of this analysis.

There  are  other  limitations  of  this  study  for  the  application  of  risk  adjustment  for  reimbursement. 
The  study  population  was  limited  to  health  plan  members  who  were  enrolled  for  the  full  study 
period.  Those  who  join  and  those  who  leave  a  health  plan  contribute  important  sources  of 
variability  to  the  distribution  of  costs.  Those  who  join  with  no  prior  exposure  to  a  given  health 
plan  might  fall  anywhere  along  a  spectrum  of  risk.  The  variability  of  the  effects  of  those  who  leave 
a  plan  will  be  related  to  factors  such  as  whether,  and  why,  they  disenroll  or  whether  they  die.  It  is 
not clear from this study what the limits and consequences of those effects will be.

There  is  also  a  limitation  related  to  scale  in  this  study.  It  is  not  clear  how  important  the  effects 
examined  in  this  study  will  prove  to  be  across  very  large  units  of  analysis.  Once  again,  more 
information  is  needed  about  the  extent  of  differences  that  actually  exist  between  relevant 
populations,  and  the  nature  of  those  differences.  The  more  immediate  value  of  this  study  may  be  in 
the  area  of  profiling  risk  adjusted  differences  across  various  units  of  analysis  rather  than  in 
reimbursement.  Although  the  effects  of  bias  on  retrospectively  derived  expectations  were  not 
examined  here,  the  implications  of  the  study  results  that  alternative  sources  of  bias  may  confound 
such  expectations  are  much  the  same.  For  example,  a  recent  analysis  comparing  the  use  of  simple 
demographic factors to ACGs in profiling primary care physician resource use did not specifically
address  the  underlying  distribution  of  the  costs  used  to  establish  risk  adjusted  expectations  (Tucker 
et  al.  1996).  While  that  analysis  emphasized  the  instability  of  risk  adjusted  profiling  indicators  due 
to  physician  panel  size,  the  study  results  were  biased  in  the  comparison  of  the  risk  adjustment 
methods  to  the  extent  that  those  results  reflect  bias  in  the  underlying  distribution  of  costs,  and  not 
differences  inherent  in  the  risk  adjustment  methods.  The  results  of  this  study  also  raise  questions 
about the nature and extent of bias that is inherent in a risk adjustment method, and that contributes to the
comparison  of  alternative  methods. 

CONCLUSION 

There  is  a  sense  in  which  this  research  (and  the  accompanying  analysis)  constitutes  a  review  of 
well-established  theory  from  several  disciplines.  An  actuary  would  recognize  the  conclusions 
regarding  the  application  of  stoploss  in  the  process  of  risk  adjustment.  A  statistician  would 
recognize  the  implications  for  underlying  inferences  in  the  treatment  of  dependent  cost  values.  A 
health  service  researcher  would  accept  the  suggestion  that  risk  adjustment  might  be  used  to  discern 
variation  in  performance  beyond  that  attributable  to  health  status  alone,  since  that  is  its  essential 
purpose.  What  is  different  about  this  analysis  is  that  it  is  an  attempt  to  assimilate  aspects  of  those 
disciplines  to  examine  the  process  of  applying  risk  adjustment  for  payment  as  a  whole.  Moreover, 
the  results  of  this  study  suggest  that  existing  risk  adjustment  methods  may  already  be  adequate  to 
approach  the  broader  purpose  of  accounting  for  health  status  differences,  at  least  in  terms  of  bias  in 
the  distribution  of  costs,  and  that  techniques  for  group-level  analysis  exist  to  facilitate  the 
assessment  of  that  distribution. 

At  the  end  of  the  first  chapter  of  this  report,  several  analysts  were  noted  for  suggesting  that  it  is 
time  to  shift  the  focus  of  research  from  a  search  for  the  perfect  risk  adjustment  method  to  the 
process  of  (applying)  risk  adjustment.  Perhaps  the  most  important  implication  of  this  study  is  that, 
in  order  to  understand  the  process  of  risk  adjustment  as  a  whole,  it  will  be  necessary  to  shift  from  a 
focus  on  individual-level  measures  of  performance  to  group-level  measures  that  reveal  more 
information  about  that  broader  process.  The  population  orientation  of  both  insurance  theory  and 
principles  of  statistical  inference  may  help  lay  the  foundation  for  understanding  why  it  is  important 
to  approach  risk  adjustment  as  a  process  affecting  populations — rather  than  to  focus  on  individual- 
level  risks — as  a  means  to  control  health  care  costs  and,  by  extension,  to  improve  the  distribution 
of  (access  to)  health  care  services. 

One question that may hover in the mind of an economist is how much of any of this will make a
real difference in how health plans or providers operate. The quickest response is that it is not
clear from this study, but that further refinement of the type of group-level analysis used here may
make the limitations of the findings in this study clearer. In any event, the results of this study
suggest that, in order to be able to assess differences due to selection across health plans in terms of
market economics, for example, it is important to be sure that even more fundamental effects related
to  basic  statistical  principles  do  not  confound  that  assessment.  Differences  that  are  evident  across 
health  plans,  or  across  geographic  areas — across  any  particular  aggregation  of  individuals — need  to 
be  understood  in  terms  of  very  basic  statistical  effects  before  differences  that  emerge  at  the  level  of 
market  economics  can  be  fully  appreciated.  In  Chapter  5,  the  point  was  made  that  it  may  be 
possible  to  define  the  trade-off  between  some  defined  level  of  success  on  a  summary  measure  of 
performance  and  the  level  of  bias  that  underlies  that  success.  The  results  of  this  study  illustrate 
that  both  of  those  perspectives  are  important,  and  that  considering  one  to  the  exclusion  of  the  other 
may  mask  a  relevant  source  of  bias.  Group-level  analysis  of  the  effects  of  independent  components 
of  the  process  of  applying  risk  adjusted  payment  rates  may  provide  a  means  to  articulate  the 
consequences  of  each  perspective  on  the  provision  of  health  care  services  more  generally. 

Finally,  there  is  the  seed,  in  the  results  of  this  study,  for  understanding  why  it  is  important  that 
selected components of the broadly defined health care system be standardized across that system.
One  key  to  understanding  the  distribution  of  resources,  and  eventually  controlling  costs  in 
particular,  is  to  establish  minimum  data  standards  that  would  (at  a  minimum)  make  it  possible  to 
examine  the  distribution  of  resource  use  at  the  level  of  diagnosis  throughout  the  system.  A 
minimum  set  of  data  can  provide  a  useful  baseline  of  information  about  that  distribution  and 
facilitate  the  development  of  standards  for  communicating  more  detailed  information  in  the  future 
(once  the  true  trade-off  between  the  potential  and  the  threat  of  ready  access  to  information  is  more 
clear). Clearly, there are tremendously important privacy considerations that need to be taken into account.
There are also important limitations to the application of diagnoses as the basis for allocating
resources.  Nevertheless,  the  intimacy  of  the  relationship  between  physicians,  patients,  and  the 
diagnostic  process  may  be  the  most  stable  element  of  the  health  care  system  as  a  whole.  The 
absence  of  effective  data  standards  and  the  availability  of  data  may  be  the  most  critical  obstacles  to 
achieving  consensus  on  the  most  appropriate  method  for  adjusting  health  service  payments  and, 
again  by  extension,  to  the  equitable  distribution  of  health  service  reimbursement  that  might  help 
assure  access  to  services  for  those  who  represent  poor  financial  risks  in  an  increasingly  competitive 
health  service  market. 

APPENDIX A
The Johns Hopkins Ambulatory Case-Mix System

Case-Mix Groupings

1.1  Ambulatory Care Groups
1.2  Ambulatory Diagnostic Groups

Ambulatory Care Groups (ACGs)


01  Acute: Minor, Age <1
02  Acute: Minor, Age 2-5
03  Acute: Minor, Age 6+
04  Acute: Major
05  Likely to Recur, without Allergies
06  Likely to Recur, with Allergies
07  Asthma
08  Chronic Medical, Unstable
09  Chronic Medical, Stable
10  Chronic Specialty
11  Ophthalmological/Dental
12  Chronic Specialty, Unstable
13  Psychosocial, without Psychosocial Major
14  Psychosocial with Psychosocial Major, without Psychosocial Minor
15  Psychosocial with Psychosocial Major, with Psychosocial Minor
16  Preventive/Administrative
17  Pregnancy
18  Acute Minor and Acute Major
19  Acute Minor and Likely to Recur Discrete, Age <1
20  Acute Minor and Likely to Recur Discrete, Age 2-5
21  Acute Minor and Likely to Recur Discrete, Age >5, Without Allergy
22  Acute Minor and Likely to Recur Discrete, Age >5, With Allergy
23  Acute Minor and Chronic Medical: Stable
24  Acute Minor and Eye/Dental
25  Acute Minor and Psychosocial Without Psychosocial Major
26  Acute Minor and Psychosocial With Psychosocial Major, Without Psyc. Minor
27  Acute Minor and Psychosocial with Psychosocial Major and Minor
28  Acute Major and Likely to Recur Discrete
29  Acute Minor/Acute Major/Likely to Recur Discrete, Age <2
30  Acute Minor/Acute Major/Likely to Recur Discrete, Age 2-5
31  Acute Minor/Acute Major/Likely to Recur Discrete, Age 6-11
32  Acute Minor/Acute Major/Likely to Recur Discrete, Age >5, Without Allergy
33  Acute Minor/Acute Major/Likely to Recur Discrete, Age >5, With Allergy
34  Acute Minor/Likely to Recur Discrete/Eye & Dental
35  Acute Minor/Likely to Recur Discrete/Psychosocial
36  Acute Minor/Acute Major/Likely to Recur Discrete/Eye & Dental
37  Acute Minor/Acute Major/Likely to Recur Dis/Psychosocial
38  2-3 Other ADG Combos, Age <17
39  2-3 Other ADG Combos, Males Age 17-34
40  2-3 Other ADG Combos, Females Age 17-34
41  2-3 Other ADG Combos, Age >34
42  4-5 Other ADG Combos, Age <17
43  4-5 Other ADG Combos, Age 17-44
44  4-5 Other ADG Combos, Age >44
45  6-9 Other ADG Combos, Age <6
46  6-9 Other ADG Combos, Age 6-16
47  6-9 Other ADG Combos, Males Age 17-34
48  6-9 Other ADG Combos, Females Age 17-34
49  6-9 Other ADG Combos, Age >34
50  10+ Other ADG Combos
51  No Visits and/or No ADGs
52  No Charges

Ambulatory Diagnostic Groups (ADGs)

 1  Time Limited: Minor
 2  Time Limited: Minor-Primary Infections
 3  Time Limited: Major
 4  Time Limited: Major-Primary Infections
 5  Allergies
 6  Asthma
 7  Likely to Recur: Discrete
 8  Likely to Recur: Discrete-Infections
 9  Likely to Recur: Progressive
10  Chronic Medical: Stable
11  Chronic Medical: Unstable
12  Chronic Specialty: Stable-Orthopedic
13  Chronic Specialty: Stable-Ear, Nose, Throat
14  Chronic Specialty: Stable-Eye
15  Chronic Specialty: Stable-Other
16  Chronic Specialty: Unstable-Orthopedic
17  Chronic Specialty: Unstable-Ear, Nose, Throat
18  Chronic Specialty: Unstable-Eye
19  Chronic Specialty: Unstable-Other
20  Dermatologic
21  Injuries/Adverse Effects: Minor
22  Injuries/Adverse Effects: Major
23  Psychosocial: Major
24  Psychosocial: Other
25  Psychophysiologic
26  Signs/Symptoms: Minor
27  Signs/Symptoms: Uncertain
28  Signs/Symptoms: Major
29  Discretionary
30  See and Reassure
31  Prevention/Administrative
32  Malignancy
33  Pregnancy
34  Dental

APPENDIX B
Summaries for Primary Analysis

1  Preparation Summaries
     Regression Parameters, Stoploss at $25,000, HMO-A
       One-part model using actual dollars
       Four-part model using the log of dollars, Ambulatory-Only Users
       Four-part model using the log of dollars, Inpatient Users
     Mean Expected Values (By Stoploss Level)
       HMO-A
       HMO-B
     Mean Expected Values Before Smearing (By Stoploss Level)
       HMO-A
       HMO-B
     Smearing Factors (By Stoploss Level)
       HMO-A
       HMO-B

2  Individual-level Analysis Summaries (By Stoploss Level)
     Adjusted R2 Values
       HMO-A
       HMO-B
     Mean Absolute Error, validation sample (HMO-A & B)
     Standard Deviation of Absolute Error, validation sample (HMO-A & B)
     Percent of Absolute Error within $25, validation sample (HMO-A & B)
     Percent of Absolute Error within $50, validation sample (HMO-A & B)
     Percent of Absolute Error more than $400, validation sample (HMO-A & B)

3  Group-level Analysis Summaries (By Stoploss Level)
     Mean Forecasting Bias
       HMO-A
       HMO-B
     Mean Squared Forecasting Error
       HMO-A
       HMO-B
     Percent of Groups within 5% of Actual
       HMO-A
       HMO-B
     Selected Statistical Tests of Significance
       One-part model using actual dollars, Groups of 3,000
       Four-part model using the log of dollars

NOTE: Figures reflecting actual values, expected values, and absolute error
are reported in dollars ($) per member per month.


One-part model using actual dollars

HMO-A

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 A, G, A*G, & A*A

Source        DF    Sum of Squares     Mean Square   F Value    Prob>F
Model          4      19781681.004    4945420.2509   130.264    0.0001
Error      12000      455574397.12     37964.53309
C Total    12004      475356078.12

Root MSE    194.84489     R-square    0.0416
Dep Mean     72.59842     Adj R-sq    0.0413
C.V.        268.38725

Parameter Estimates

                     Parameter        Standard     T for H0:
Variable    DF        Estimate           Error   Parameter=0    Prob > |T|
INTERCEP     1       33.442741      6.12787976         5.457        0.0001
AGE91        1       -0.943596      0.38256057        -2.467        0.0137
GENDER       1       14.375580      7.04759683         2.040        0.0414
CROSS91      1        0.245718      0.20524319         1.197        0.2313
ASQ91        1        0.047686      0.00577044         8.264        0.0001
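
The summary statistics printed with the one-part age and gender model above (F value, R-square, adjusted R-square, Root MSE, C.V.) follow arithmetically from the sums of squares and degrees of freedom. The sketch below re-derives them as a consistency check on the transcribed figures; the function and variable names are introduced here for illustration, and the formulas are the standard least-squares definitions rather than anything specific to this study.

```python
import math

# Values transcribed from the analysis-of-variance table for the one-part
# model with A, G, A*G and A*A (HMO-A, stoploss at $25,000).
SS_MODEL = 19781681.004
SS_ERROR = 455574397.12
DF_MODEL = 4
DF_ERROR = 12000
DEP_MEAN = 72.59842


def regression_summary(ss_model, ss_error, df_model, df_error, dep_mean):
    """Derive the usual fit statistics from an ANOVA decomposition."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    r_square = ss_model / ss_total
    root_mse = math.sqrt(ms_error)
    return {
        "f_value": ms_model / ms_error,
        "r_square": r_square,
        # Adjusted R-square penalizes for the number of fitted parameters.
        "adj_r_square": 1.0 - (1.0 - r_square) * df_total / df_error,
        "root_mse": root_mse,
        # Coefficient of variation: Root MSE as a percent of the mean.
        "cv": 100.0 * root_mse / dep_mean,
    }


summary = regression_summary(SS_MODEL, SS_ERROR, DF_MODEL, DF_ERROR, DEP_MEAN)
for name, value in summary.items():
    print(f"{name:>12}: {value:.4f}")
```

The derived values agree with the printed ones to the precision shown, which also confirms that only about 4 percent of the variance in Year-2 charges is explained by this specification.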

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 Age & Gender Categories

Source        DF    Sum of Squares     Mean Square   F Value    Prob>F
Model          8      16267133.254    2033391.6568    53.133    0.0001
Error      11996      459088944.87     38270.16880
C Total    12004      475356078.12

Root MSE    195.62763     R-square    0.0342
Dep Mean     72.59842     Adj R-sq    0.0336
C.V.        269.46542

Parameter Estimates

                     Parameter        Standard     T for H0:
Variable    DF        Estimate           Error   Parameter=0    Prob > |T|
INTERCEP     1       64.314737      4.26961043        15.063        0.0001
AGE1         1      -19.640529     16.48742007        -1.191        0.2336
AGE2         1      -35.402695      7.58635981        -4.667        0.0001
AGE3         1      -49.736328      6.46971674        -7.688        0.0001
AGE4         1      -37.315470      6.99071410        -5.338        0.0001
AGE5         1       -6.804042      5.15066166        -1.321        0.1865
AGE7         1       52.605668      5.56438155         9.454        0.0001
AGE8         1      121.018530     17.04452121         7.100        0.0001
GENDER       1       21.045853      3.57579764         5.886        0.0001
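
To show how parameter estimates of this kind translate into an expected payment, the sketch below sums the relevant coefficients from the age and gender category model for a single member. Two points are assumptions made here and are not stated in the listing: that age category 6, which has no coefficient, is the omitted reference category, and that GENDER is a 0/1 indicator whose coding (which sex equals 1) is not given. The function name is hypothetical.

```python
# Parameter estimates transcribed from the age and gender category model
# (HMO-A, stoploss at $25,000), in dollars per member per month.
COEFFICIENTS = {
    "INTERCEP": 64.314737,
    "AGE1": -19.640529,
    "AGE2": -35.402695,
    "AGE3": -49.736328,
    "AGE4": -37.315470,
    "AGE5": -6.804042,
    "AGE7": 52.605668,
    "AGE8": 121.018530,
    "GENDER": 21.045853,
}

# Assumption: category 6 is the reference level absorbed by the intercept.
REFERENCE_AGE_CATEGORY = 6


def expected_pmpm(age_category, gender_flag):
    """Expected Year-2 charges for one member under the category model."""
    if age_category not in range(1, 9):
        raise ValueError("age_category must be between 1 and 8")
    if gender_flag not in (0, 1):
        raise ValueError("gender_flag must be 0 or 1")
    value = COEFFICIENTS["INTERCEP"]
    if age_category != REFERENCE_AGE_CATEGORY:
        value += COEFFICIENTS[f"AGE{age_category}"]
    value += COEFFICIENTS["GENDER"] * gender_flag
    return value


print(f"Reference member (AGE6, GENDER=0): {expected_pmpm(6, 0):.2f}")
print(f"AGE7, GENDER=1:                    {expected_pmpm(7, 1):.2f}")
```

Averaging such member-level expectations over a group's enrollees would give that group's capitation rate under this specification, subject to the coding assumptions noted above.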

HMO-A

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 ADGs, A, G, A*G, A*A

Source        DF    Sum of Squares     Mean Square   F Value    Prob>F
Model         38      50926662.501    1340175.3290    37.784    0.0001
Error      11966      424429415.62     35469.61521
C Total    12004      475356078.12

Root MSE    188.33379     R-square    0.1071
Dep Mean     72.59842     Adj R-sq    0.1043
C.V.        259.41859

Parameter Estimates

                     Parameter        Standard     T for H0:
Variable    DF        Estimate           Error   Parameter=0    Prob > |T|
INTERCEP     1       10.975193     [illegible]         1.597   [illegible]
AGE91        1       -1.046676     [illegible]        -2.662   [illegible]
GENDER       1        3.441929     [illegible]         0.501   [illegible]
CROSS91      1        0.001364     [illegible]         0.007   [illegible]
ASQ91        1        0.039494     [illegible]         6.701   [illegible]
ADG9101      1        3.524866     [illegible]         0.761   [illegible]
ADG9102      1       11.203700     [illegible]         2.818   [illegible]
ADG9103      1       24.773452     [illegible]         3.960   [illegible]
ADG9104      1        8.968810     [illegible]         1.347   [illegible]
ADG9105      1       10.936624     11.28392472         0.969        0.3325
ADG9106      1        9.617915     10.60990330         0.907        0.3647
ADG9107      1       20.432519      5.45934111         3.743        0.0002
ADG9108      1       -1.132207      4.47020380        -0.253        0.8001
ADG9109      1       66.450347     17.67527274         3.760        0.0002
ADG9110      1       13.036269      5.07812504         2.567        0.0103
ADG9111      1       63.990223      6.73045285         9.508        0.0001
ADG9112      1       33.464763     13.62013426         2.457        0.0140
ADG9113      1      -17.671665     19.96285719        -0.885        0.3761
ADG9114      1        6.999999      5.46338007         1.281        0.2001
ADG9115      1       16.474615     50.70799688         0.325        0.7453
ADG9116      1        8.685557     14.53411500         0.598        0.5501
ADG9117      1        8.327588      8.31797468         1.001        0.3168
ADG9118      1       38.119650     12.57527803         3.031        0.0024
ADG9119      1       35.090608     22.12081980         1.586        0.1127
ADG9120      1        5.523975      7.31215682         0.755        0.4500
ADG9121      1        5.884204      5.19993086         1.132        0.2578
ADG9122      1        9.088078      5.76580814         1.576        0.1150
ADG9123      1       21.909563      9.00541905         2.433        0.0150
ADG9124      1       86.453919     24.78192895         3.489        0.0005
ADG9125      1       12.297562     10.14298139         1.212        0.2254
ADG9126      1        8.077221      4.70256392         1.718        0.0859
ADG9127      1       20.054744      6.29470910         3.186        0.0014
ADG9128      1       18.092207      4.30098065         4.207        0.0001
ADG9129      1       16.668552      6.28829992         2.651        0.0080
ADG9130      1       25.642716     14.13534603         1.814        0.0697
ADG9131      1        4.013634      3.99125331         1.006        0.3146
ADG9132      1       51.052453      9.15855247         5.574        0.0001
ADG9133      1      118.315785     10.59892505        11.163        0.0001
ADG9134      1       -7.703634     31.95462288        -0.241        0.8095

HMO-A

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 ADGs & A/G Categories

Source        DF    Sum of Squares     Mean Square   F Value    Prob>F
Model         42      48946919.825    1165402.8530    32.693    0.0001
Error      11962      426409158.30     35646.97862
C Total    12004      475356078.12

Root MSE    188.80407     R-square    0.1030
Dep Mean     72.59842     Adj R-sq    0.0998
C.V.        260.06638

Parameter Estimates

                     Parameter        Standard     T for H0:
Variable    DF        Estimate           Error   Parameter=0    Prob > |T|
INTERCEP     1       26.443694     [illegible]         5.829        0.0001
AGE1         1      -16.997422     [illegible]        -1.036        0.3002
AGE2         1      -18.841890     [illegible]        -2.396        0.0166
AGE3         1      -28.960178      6.55110665        -4.421        0.0001
AGE4         1      -18.259970      6.93271419        -2.634        0.0085
AGE5         1       -6.437570      5.07562283        -1.268        0.2047
AGE7         1       36.105014      5.49491641         6.571        0.0001
AGE8         1       71.045294     16.81835150         4.224        0.0001
GENDER       1        2.971877      3.61219353         0.823        0.4107
ADG9101      1        3.966048      4.65233094         0.852        0.3940
ADG9102      1       10.965523      3.99122775         2.747        0.0060
ADG9103      1       25.251023      6.27897228         4.022        0.0001
ADG9104      1        8.967756      6.67887057         1.343        0.1794
ADG9105      1       10.065122     11.31975270         0.889        0.3739
ADG9106      1        9.930652     [illegible]         0.933        0.3510
ADG9107      1       20.000071      5.47500547         3.653        0.0003
ADG9108      1       -0.456351      4.50640299        -0.101        0.9193
ADG9109      1       73.835322     17.71892650         4.167        0.0001
ADG9110      1       16.518308      5.07666574         3.254        0.0011
ADG9111      1       68.480961      6.72296724        10.186        0.0001
ADG9112      1       35.069410     13.65467427         2.568        0.0102
ADG9113      1      -15.526462     20.02340010        -0.775        0.4381
ADG9114      1        7.546570      5.50880115         1.370        0.1707
ADG9115      1       20.131920     50.84399311         0.396        0.6921
ADG9116      1        9.307863     14.57476309         0.639        0.5231
ADG9117      1        8.414987      8.34288178         1.009        0.3132
ADG9118      1       46.215669     12.58575117         3.672        0.0002
ADG9119      1       34.928319     22.18150046         1.575        0.1154
ADG9120      1        3.678533      7.33861811         0.501        0.6162
ADG9121      1        4.360578      5.23146786         0.834        0.4046
ADG9122      1        8.303727      5.77936230         1.437        0.1508
ADG9123      1       21.001192      9.03018394         2.326        0.0201
ADG9124      1       86.513640     24.84682017         3.482        0.0005
ADG9125      1       10.332244     10.16220913         1.017        0.3093
ADG9126      1        7.994813      4.70828892         1.698        0.0895
ADG9127      1       20.960792      6.31153943         3.321        0.0009
ADG9128      1       18.349240      4.31198487         4.255        0.0001
ADG9129      1       16.542755      6.30242459         2.625        0.0087
ADG9130      1       28.003580     14.16914188         1.976        0.0481
ADG9131      1        3.650340      4.03652894         0.904        0.3658
ADG9132      1       51.837348      9.18163479         5.646        0.0001
ADG9133      1      111.650208     10.74858072        10.387        0.0001
ADG9134      1      -10.178533     32.03707996        -0.318        0.7507

HMO-A

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  1991 ACGs

Source        DF    Sum of Squares     Mean Square   F Value    Prob>F
Model         50      39580746.165    791614.92330    21.715    0.0001
Error      11954      435775331.96     36454.35268
C Total    12004      475356078.12

Root MSE    190.93023     R-square    0.0833
Dep Mean     72.59842     Adj R-sq    0.0794
C.V.        262.99504

Parameter Estimates

                     Parameter        Standard     T for H0:
Variable    DF        Estimate           Error   Parameter=0    Prob > |T|
INTERCEP     1       29.543085      5.70004477         5.183        0.0001
ACG9101      1        2.967331     67.74425877         0.044        0.9651
ACG9102      1        9.488165     22.09456763         0.429        0.6676
ACG9103      1       13.049833      8.67624973         1.504        0.1326
ACG9104      1        7.112096     11.80655625         0.602        0.5469
ACG9105      1        0.886610     11.83323599         0.075        0.9403
ACG9106      1      -11.407669     48.07169181        -0.237        0.8124
ACG9107      1      -14.671873     57.84913632        -0.254        0.7998
ACG9108      1        5.556508     30.35821246         0.183        0.8548
ACG9109      1       45.784574     17.05966057         2.684        0.0073
ACG9110      1       -6.404196    110.38089239        -0.058        0.9537
ACG9111      1        2.264824     15.44173906         0.147        0.8834
ACG9112      1       63.370035     28.42736420         2.229        0.0258
ACG9113      1       14.391439     51.34562148         0.280        0.7793
ACG9114      1       -0.188919     36.52987377        -0.005        0.9959
ACG9115      0               0      0.00000000
ACG9116      1       -7.964342     13.67326247        -0.582        0.5603
ACG9117      1       83.609089     40.21768723         2.079        0.0376
ACG9118      1       23.616361      9.17999245         2.573        0.0101
ACG9119      1       -0.579317     40.21768723        -0.014        0.9885
ACG9120      1        4.807294     15.47958379         0.311        0.7561
ACG9121      1        9.400343     10.32398776         0.911        0.3626
ACG9122      1        9.749772     32.77260204         0.297        0.7661
ACG9123      1       13.064058     19.48524626         0.670        0.5026
ACG9124      1       17.302356     18.54902085         0.933        0.3509
ACG9125      1       -1.591696     55.41076821        -0.029        0.9771
ACG9126      1      -12.376419     49.62641121        -0.249        0.8031
ACG9127      1      -19.543085    191.01529570        -0.102        0.9185
ACG9128      1       32.403432     14.62379736         2.216        0.0267
ACG9129      1       23.268509     28.72240307         0.810        0.4179
ACG9130      1       26.948435     15.67398671         1.719        0.0856
ACG9131      1        2.005969     17.05966057         0.118        0.9064
ACG9132      1       52.013541     10.96960911         4.742        0.0001
ACG9133      1        1.149222     37.87586300         0.030        0.9758
ACG9134      1       16.119516     21.84160679         0.738        0.4605
ACG9135      1       59.250475     29.34277638         2.019        0.0435
ACG9136      1       98.915754     13.41935106         7.371        0.0001
ACG9137      1       21.639785     23.21208660         0.932        0.3512
ACG9138      1       -1.605665     13.12106550        -0.122        0.9026
ACG9139      1       14.118326     18.99758148         0.743        0.4574
ACG9140      1       93.644101     14.68598694         6.376        0.0001
ACG9141      1       77.046507      9.16801706         8.404        0.0001
ACG9142      1       16.853256     12.62969362         1.334        0.1821
ACG9143      1       77.150227      9.17999245         8.404        0.0001
ACG9144      1      109.116300     11.07808903         9.850        0.0001
ACG9145      1       62.219826     23.36518987         2.663        0.0078

ACG9146 

1 

49. 

238204 

16. 

.  17909666 

3  . 

043 

0. 

0023 

ACG9147 

1 

41. 

302503 

23. 

,84499805 

1 . 

732 

0. 

0833 

ACG9148 

1 

124. 

305831 

13. 

.  44168650 

9. 

248 

0. 

0001 

ACG9149 

1 

163. 

083692 

9. 

,33952760 

17. 

462 

0. 

0001 

ACG9150 

1 

294. 

681991 

14. 

. 10491907 

20. 

892 

0. 

0001 

ACG9151 

1 

4. 

953925 

7. 

,78163912 

0. 

637 

0. 

5244 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 Chronic Flag, A, G, A*G, & A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         5    27688671.773   5537734.3546    148.430    0.0001
Error     11999    447667406.35    37308.72626
C Total   12004    475356078.12

Root MSE    193.15467     R-square    0.0582
Dep Mean     72.59842     Adj R-sq    0.0579
C.V.        266.05907

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1     26.784543     6.09191471         4.397       0.0001
AGE91       1     -0.993738     0.37925760        -2.620       0.0088
GENDER      1     13.236877     6.98689872         1.895       0.0582
CROSS91     1      0.092975     0.20373311         0.456       0.6481
ASQ91       1      0.041038     0.00573858         7.151       0.0001
CHRONIC     1     59.025296     4.05450436        14.558       0.0001
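
The fit statistics printed with each ANOVA table follow from the sums of squares and degrees of freedom by standard least-squares identities. A minimal Python sketch, using the figures from the chronic-flag model above, shows the arithmetic; the function and field names are illustrative assumptions and are not part of the original SAS output.

```python
import math
from dataclasses import dataclass


@dataclass
class FitStats:
    r_square: float
    adj_r_square: float
    f_value: float
    root_mse: float
    cv: float


def fit_stats(ss_model, df_model, ss_error, df_error, dep_mean):
    """Derive summary fit statistics from an ANOVA table (hypothetical helper)."""
    ss_total = ss_model + ss_error          # corrected total sum of squares
    df_total = df_model + df_error          # corrected total degrees of freedom
    ms_model = ss_model / df_model          # model mean square
    ms_error = ss_error / df_error          # error mean square
    root_mse = math.sqrt(ms_error)
    return FitStats(
        r_square=ss_model / ss_total,
        adj_r_square=1.0 - ms_error / (ss_total / df_total),
        f_value=ms_model / ms_error,
        root_mse=root_mse,
        cv=100.0 * root_mse / dep_mean,     # coefficient of variation, in percent
    )


# Values copied from the table above (Model, Error, Dep Mean).
stats = fit_stats(27688671.773, 5, 447667406.35, 11999, 72.59842)
print(stats)
```

The same function applies unchanged to the log-dollar models, where the sums of squares and dependent mean are on the log scale.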

DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 Chronic Flag & A/G Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         9    25484605.551   2831622.8390     75.500    0.0001
Error     11995    449871472.57    37504.91643
C Total   12004    475356078.12

Root MSE    193.66186     R-square    0.0536
Dep Mean     72.59842     Adj R-sq    0.0529
C.V.        266.75770

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1     43.035728     4.43930550         9.694       0.0001
AGE1        1     -6.544049    16.34311110        -0.400       0.6889
AGE2        1    -21.008290     7.56604887        -2.777       0.0055
AGE3        1    -34.747010     6.47568166        -5.366       0.0001
AGE4        1    -22.657419     6.98334526        -3.244       0.0012
AGE5        1      1.255856     5.12475923         0.245       0.8064
AGE7        1     41.377689     5.55483327         7.449       0.0001
AGE8        1     99.918395    16.92684469         5.903       0.0001
GENDER      1     14.931666     3.56128646         4.193       0.0001
CHRONIC     1     63.353455     4.04118267        15.677       0.0001
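
Each T value in the parameter tables is the estimate divided by its standard error, and Prob > |T| is the corresponding two-sided tail probability. The sketch below reproduces two rows of the table above. It assumes a normal approximation to the t distribution (reasonable with 11,995 error degrees of freedom), so it is not the exact computation SAS performs, and the helper name is hypothetical.

```python
import math


def t_and_p(estimate, std_error):
    """Return (T, approximate two-sided p-value) for one coefficient."""
    t = estimate / std_error
    # Two-sided standard-normal tail: 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


# (estimate, standard error) pairs copied from the AGE1 and AGE2 rows above.
rows = {
    "AGE1": (-6.544049, 16.34311110),
    "AGE2": (-21.008290, 7.56604887),
}
results = {name: t_and_p(b, se) for name, (b, se) in rows.items()}

for name, (t, p) in results.items():
    print(f"{name}: T = {t:.3f}, Prob > |T| ~ {p:.4f}")
```

For the smaller inpatient-user samples later in this appendix, the normal approximation is looser and an exact t distribution would be the safer choice.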
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 A, G, A*G, & A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         4       850.93945      212.73486    176.084    0.0001
Error     10092     12192.55921        1.20814
C Total   10096     13043.49866

Root MSE      1.09915     R-square    0.0652
Dep Mean      2.98749     Adj R-sq    0.0649
C.V.         36.79189

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      2.766385     0.03693823        74.892       0.0001
AGE91       1     -0.006268     0.00235608        -2.660       0.0078
GENDER      1     -0.074698     0.04273454        -1.748       0.0805
CROSS91     1      0.010014     0.00125371         7.988       0.0001
ASQ91       1      0.000253     0.00003603         7.030       0.0001

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 Age & Gender Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         8       825.61081      103.20135     85.211    0.0001
Error     10088     12217.88784        1.21113
C Total   10096     13043.49866

Root MSE      1.10051     R-square    0.0633
Dep Mean      2.98749     Adj R-sq    0.0626
C.V.         36.83739

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      3.009588     0.02623926       114.698       0.0001
AGE1        1      0.162514     0.09799173         1.658       0.0973
AGE2        1     -0.251152     0.04523782        -5.552       0.0001
AGE3        1     -0.520170     0.03921934       -13.263       0.0001
AGE4        1     -0.376412     0.04221098        -8.917       0.0001
AGE5        1     -0.268079     0.03194819        -8.391       0.0001
AGE7        1      0.261546     0.03417309         7.654       0.0001
AGE8        1      0.492363     0.10953341         4.495       0.0001
GENDER      1      0.219717     0.02192819        10.020       0.0001
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 ADGs, A, G, A*G, A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        38      2203.85189       57.99610     53.814    0.0001
Error     10058     10839.64677        1.07771
C Total   10096     13043.49866

Root MSE      1.03813     R-square    0.1690
Dep Mean      2.98749     Adj R-sq    0.1658
C.V.         34.74922


Parameter  Estimates 


Parameter 

Standard 

T  for  HO: 

Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  > 

|T|

INTERCEP 

1 

2.315748 

0. 

04108139 

D  D  . 

•370 

0. 

0001 

AGE  91 

1 

0.001518 

0. 

00236491 

U  . 

0. 

5210 

GENDER 

1 

-0. 104279 

0. 

04068656 

 Q 

SCO 

0  0  0 

0. 

0104 

CROSS 91 

1 

0. 006880 

0. 

00119825 

O  . 

1  AO 

0. 

0001 

ASQ91 

1 

0. 000114 

0. 

00003580 

-5 

O  . 

ISO 

0. 

0014 

ADG9101 

1 

0.155385 

0. 

02736253 

D  . 

0  /  y 

0. 

0001 

ADG9102 

1 

0. 165531 

0. 

02371793 

c 
0  . 

Q ^  Q 

0. 

0001 

ADG9103 

1 

0.079026 

0. 

03766175 

0 

z  . 

nQQ 
uyo 

0. 

0359 

ADG9104 

1 

0.061372 

0. 

03933822 

1 

J_  . 

sen 

JDU 

0. 

1188 

ADG9105 

1 

0. 151700 

0. 

06621657 

9 

0. 

0220 

ADG9106 

1 

0. 154597 

0. 

06333697 

2. 

441 

0. 

0147 

ADG9107 

1 

0. 178209 

0. 

03252138 

5. 

480 

0. 

0001 

ADG9108 

1 

0. 157696 

0. 

02646431 

5. 

959 

0. 

0001 

ADG9109 

1 

0. 004560 

0. 

11208629 

0 

041 

0. 

9675 

ADG9110 

1 

0. 108413 

0. 

03035159 

3 

572 

0. 

0004 

ADG9111 

1 

0.241775 

0. 

04107120 

5 

887 

0  . 

0001 

ADG9112 

1 

0. 102730 

0. 

08084284 

1 

271 

0. 

2039 

ADG9113 

1 

-0. 060083 

0. 

11671650 

-0 

515 

0. 

6067 

ADG9114 

1 

0. 050305 

0. 

03238435 

1 

553 

0. 

1204 

ADG9115 

1 

-0.237198 

0. 

30295359 

-0 

783 

0. 

4337 

ADG9116 

1 

0. 046113 

0. 

08738997 

0 

528 

0. 

5977 

ADG9117 

1 

0. 121463 

0. 

04876852 

2 

491 

0. 

0128 

ADG9118 

1 

0.297470 

0. 

07549010 

3 

941 

0. 

0001 

ADG9119 

1 

0.211616 

0. 

13628475 

1 

553 

0. 

1205 

ADG9120 

1 

0.098352 

0. 

04311352 

2 

281 

0. 

0226 

ADG9121 

1 

0. 145533 

0. 

03094121 

4 

704 

0. 

0001 

ADG9122 

1 

0. 052087 

0. 

03422482 

1 

522 

0. 

1281 

ADG9123 

1 

0.057261 

0. 

05492051 

1 

043 

0. 

2972 

ADG9124 

1 

0. 007407 

0. 

15441269 

0 

048 

0. 

9617 

ADG9125 

1 

0. 145885 

0. 

06028578 

2 

420 

0. 

0155 

ADG9126 

1 

0. 118197 

0. 

02779422 

4 

253 

0. 

0001 

ADG9127 

1 

0.226859 

0. 

03724937 

6 

090 

0. 

0001 

ADG9128 

1 

0.169142 

0. 

02532378 

6 

679 

0. 

0001 

ADG9129 

1 

0. 134792 

0. 

03744665 

3 

600 

0. 

0003 

ADG9130 

1 

0. 133430 

0. 

08379048 

1 

.592 

0. 

1113 

ADG9131 

1 

0. 105148 

0. 

02363463 

4 

449 

0. 

0001 

ADG9132 

1 

0. 127956 

0. 

05591221 

2 

.289 

0. 

0221 

ADG9133 

1 

0. 088601 

0. 

07419463 

1 

.  194 

0. 

2324 

ADG9134 

1 

0.277277 

0. 

18160780 

1 

.527 

0. 

1268 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 ADGs & A/G Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        42      2202.51599       52.44086     48.634    0.0001
Error     10054     10840.98267        1.07828
C Total   10096     13043.49866

Root MSE      1.03840     R-square    0.1689
Dep Mean      2.98749     Adj R-sq    0.1654
C.V.         34.75827

Parameter Estimates


Parameter  Standard        T  for  HO: 


Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  > 

|T| 

INTERCEP 

1 

2 . 

623964 

0. 

02745892 

95 . 

560 

0. 

0001 

AGE1 

1 

-0 . 

013929 

0. 

09561139 

-0 . 

146 

0. 

8842 

AGE  2 

1 

-0 . 

278468 

0. 

04619236 

-6 . 

028 

0 . 

0001 

AGE  3 

1 

-0 . 

452363 

0. 

03896172 

-11 . 

610 

0 . 

0001 

AGE  4 

1 

-0 . 

276936 

0. 

04094248 

-6 . 

764 

0 . 

0001 

AGE  5 

1 

-0 . 

231459 

0. 

03065677 

-7  . 

550 

0 . 

0001 

AGE  7 

1 

0 . 

175117 

0. 

03302041 

5 . 

303 

0. 

0001 

AGE  8 

1 

0 . 

314224 

0. 

10509520 

2  . 

990 

0  . 

0028 

GENDER 

1 

0. 

097406 

0. 

02160533 

4 . 

508 

0  . 

0001 

ADG9101 

1 

0 . 

147981 

0. 

02743645 

5 . 

394 

0 . 

0001 

ADG9102 

1 

0 . 

155006 

0 . 

02376315 

6. 

523 

0 . 

0001 

ADG9103 

1 

0 . 

086618 

0. 

03770897 

2  . 

297 

0 . 

0216 

ADG9104 

1 

0 . 

059423 

0 . 

03937739 

1 . 

509 

0 . 

1313 

ADG9105 

1 

0 . 

172425 

0 . 

06628025 

2  . 

601 

0 . 

0093 

ADG9106 

1 

0. 

162876 

0  . 

06341819 

2  . 

568 

0 . 

0102 

ADG9107 

1 

0. 

172633 

0  . 

03254438 

5. 

305 

0 . 

0001 

ADG9108 

1 

0. 

143679 

0. 

02664132 

5. 

393 

0. 

0001 

ADG9109 

1 

0. 

037370 

0. 

11202973 

0. 

334 

0. 

7387 

ADG9110 

1 

0. 

129522 

0. 

03024402 

4  . 

283 

0. 

0001 

ADG9111 

1 

0. 

252715 

0. 

04094242 

6. 

172 

0. 

0001 

ADG9112 

1 

0. 

104374 

0. 

08087517 

1. 

291 

0. 

1969 

ADG9113 

1 

-0. 

072869 

0. 

11682320 

-0. 

624 

0. 

5328 

ADG9114 

1 

0. 

059673 

0. 

03260506 

1. 

830 

0. 

0673 

ADG9115 

1 

-0. 

188743 

0. 

30309776 

-0. 

623 

0. 

5335 

ADG9116 

1 

0. 

053626 

0. 

08743642 

0. 

613 

0. 

5397 

ADG9117 

1 

0. 

133041 

0. 

04881383 

2. 

725 

0. 

0064 

ADG9118 

1 

0. 

330528 

0. 

07546019 

4. 

380 

0. 

0001 

ADG9119 

1 

0. 

226931 

0. 

13635534 

1. 

664 

0. 

0961 

ADG9120 

1 

0. 

095426 

0. 

04316318 

2. 

211 

0. 

0271 

ADG9121 

1 

0. 

148641 

0. 

03106097 

4. 

785 

0. 

0001 

ADG9122 

1 

0. 

052589 

0. 

03422243 

1. 

537 

0. 

1244 

ADG9123 

1 

0. 

058255 

0. 

05493698 

1 . 

060 

0. 

2890 

ADG9124 

1 

0. 

033314 

0. 

15446916 

0. 

216 

0. 

8293 

ADG9125 

1 

0. 

151285 

0. 

06026887 

2. 

510 

0. 

0121 

ADG9126 

1 

0. 

131423 

0. 

02775801 

4. 

735 

0. 

0001 

ADG9127 

1 

0. 

230884 

0. 

03726743 

6. 

195 

0. 

0001 

ADG9128 

1 

0. 

172554 

0. 

02533478 

6. 

811 

0. 

0001 

ADG9129 

1 

0. 

136401 

0. 

03744746 

3. 

642 

0. 

0003 

ADG9130 

1 

0. 

139955 

0. 

08379308 

1. 

670 

0. 

0949 

ADG9131 

1 

0. 

087717 

0. 

02387975 

3. 

673 

0. 

0002 

ADG9132 

1 

0. 

135043 

0. 

05593547 

2  . 

414 

0. 

0158 

ADG9133 

1 

0. 

121507 

0. 

07484051 

1. 

624 

0. 

1045 

ADG9134 

1 

0. 

257915 

0. 

18167640 

1. 

420 

0. 

1557 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  1991 ACGs

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        50      1811.73892       36.23478     32.409    0.0001
Error     10046     11231.75973        1.11803
C Total   10096     13043.49866

Root MSE      1.05737     R-square    0.1389
Dep Mean      2.98749     Adj R-sq    0.1346
C.V.         35.39327


Parameter  Estimates 


Parameter 

Standard 

T  for  HO: 

Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  > 

|T| 

INTERCEP 

1 

2.615441 

0. 

04533440 

COT 

.  o  y  z. 

0. 

0001 

ACG9101 

1 

0.764475 

0. 

40221165 

j. 

QD1 

•  jUl 

0. 

0574 

ACG9102 

1 

-0. 076238 

0. 

12948424 

Sft  9 

■  JO  J 

0. 

5560 

ACG9103 

1 

0. 119644 

0. 

06004896 

1 

J_ 

■     y  tL 

0. 

0464 

ACG9104 

1 

0. 094288 

0. 

07547853 

2_ 

.249 

0. 

2116 

ACG9105 

1 

-0. 005883 

0. 

07620365 

-0 

.  077 

0. 

9385 

ACG9106 

1 

-0. 173936 

0. 

28620746 

-0 

.  608 

0. 

5434 

ACG9107 

1 

-0. 108705 

0. 

32201642 

-0 

.  338 

0. 

7357 

ACG9108 

1 

-0. 188050 

0. 

18196615 

—  1 

.  033 

0. 

3014 

ACG9109 

1 

0. 315793 

0. 

10627864 

2 

.  971 

0. 

0030 

ACG9110 

1 

0.309149 

0. 

61215430 

o 

.505 

0. 

6136 

ACG9111

1 

0. 010772 

0. 

09802798 

o 

.  110 

0. 

9125 

ACG9112 

1 

0.238808 

0. 

17322250 

1 

.  37  9 

0. 

1680 

ACG9113 

1 

-0. 123805 

0. 

32201642 

-0 

.384 

0. 

7006 

ACG9114 

1 

0. 043230 

0. 

23514862 

o 

.184 

0. 

8541 

ACG9115 

0 

0 

. 

. 

ACG9116 

1 

-0. 069807 

0. 

08776012 

-0 

.795 

0. 

4264 

ACG9117 

1 

0. 137336 

0. 

27675033 

0 

.  496 

0. 

6197 

ACG9118 

1 

0. 302556 

0. 

06170584 

4 

.  903 

0. 

0001 

ACG9119 

1 

0. 531351 

0. 

22994541 

2 

.311 

0. 

0209 

ACG9120 

1 

0.252092 

0. 

09332827 

2 

.701 

0. 

0069 

ACG9121 

1 

0.252559 

0. 

06737749 

3 

.748 

0. 

0002 

ACG9122 

1 

0.349989 

0. 

19233757 

1 

.  820 

0. 

0688 

ACG9123 

1 

0.275708 

0. 

11864682 

2 

.  324 

0. 

0202 

ACG9124 

1 

0. 137298 

0. 

11096122 

1 

.237 

0. 

2160 

ACG9125 

1 

0. 416103 

0. 

32201642 

1 

.292 

0. 

1963 

ACG9126 

1 

-0. 118130 

0. 

30858488 

-0 

.383 

0. 

7019 

ACG9127 

1 

-0. 312856 

1. 

05834221 

-0 

.296 

0. 

7675 

ACG9128 

1 

0.262086 

0. 

09092031 

2 

.  883 

0. 

0040 

ACG9129

1 

0.771699 

0. 

16572583 

4 

.  656 

0. 

0001 

ACG9130 

1 

0. 421132 

0. 

09441560 

4 

.  460 

0. 

0001 

ACG9131 

1 

0.227864 

0. 

10197420 

2 

.235 

0. 

0255 

ACG9132 

1 

0. 635599 

0. 

07166023 

8 

.  870 

0. 

0001 

ACG9133 

1 

0. 500748 

0. 

21627882 

2 

.315 

0. 

0206 

ACG9134 

1 

0. 402935 

0. 

12801939 

3 

.147 

0. 

0017 

ACG9135 

1 

0. 696647 

0. 

17741795 

3 

.  927 

0. 

0001 

ACG9136 

1 

0. 941986 

0. 

08517841 

11 

.  059 

0. 

0001 

ACG9137 

1 

0.720163 

0. 

13876497 

5 

.  190 

0. 

0001 

ACG9138 

1 

0. 003415 

0. 

08303721 

0 

.  041 

0. 

9672 

ACG9139 

1 

0. 190962 

0. 

11810663 

1 

.617 

0. 

1059 

ACG9140 

1 

0.317302 

0. 

09802798 

3 

.237 

0. 

0012 

ACG9141 

1 

0. 598903 

0. 

06237835 

9 

.  601 

0. 

0001 

ACG9142 

1 

0. 314198 

0. 

07901011 

3 

.  977 

0. 

0001 

ACG9143 

1 

0.574898 

0. 

06321580 

9 

.  094 

0. 

0001 
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ACG9144 

1 

0. 

828808 

0. 

07290253 

11 . 

369 

0. 

0001 

ACG9145 

1 

1.

083911 

0. 

14277121 

7. 

592 

0. 

0001 

ACG9146 

1 

0. 

590518 

0. 

09992257 

5. 

910 

0. 

0001 

ACG9147 

1 

0. 

538555 

0. 

14277121 

3  . 

772 

0. 

0002 

ACG9148

1 

1.

006049 

0. 

08893134 

11 . 

313 

0. 

0001 

ACG9149 

1 

1.

346176 

0. 

06379406 

21. 

102 

0. 

0001 

ACG9150 

1 

1.

808783 

0. 

09397428 

19. 

248 

0. 

0001 

ACG9151 

1 

-0. 

064334 

0. 

05435689 

-1. 

184 

0. 

2366 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 Chronic Flag, A, G, A*G, & A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         5      1217.44303      243.48861    207.765    0.0001
Error     10091     11826.05563        1.17194
C Total   10096     13043.49866

Root MSE      1.08256     R-square    0.0933
Dep Mean      2.98749     Adj R-sq    0.0929
C.V.         36.23649

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      2.714099     0.03650056        74.358       0.0001
AGE91       1     -0.006534     0.00232057        -2.816       0.0049
GENDER      1     -0.082076     0.04209151        -1.950       0.0512
CROSS91     1      0.008785     0.00123674         7.103       0.0001
ASQ91       1      0.000203     0.00003560         5.694       0.0001
CHRONIC     1      0.437861     0.02475999        17.684       0.0001

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 Chronic Flag & A/G Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         9      1227.98622      136.44291    116.482    0.0001
Error     10087     11815.51244        1.17136
C Total   10096     13043.49866

Root MSE      1.08229     R-square    0.0941
Dep Mean      2.98749     Adj R-sq    0.0933
C.V.         36.22752

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      2.856561     0.02709354       105.433       0.0001
AGE1        1      0.268857     0.09654006         2.785       0.0054
AGE2        1     -0.147025     0.04484220        -3.279       0.0010
AGE3        1     -0.415082     0.03898456       -10.647       0.0001
AGE4        1     -0.272861     0.04188643        -6.514       0.0001
AGE5        1     -0.213387     0.03155753        -6.762       0.0001
AGE7        1      0.179497     0.03389764         5.295       0.0001
AGE8        1      0.331367     0.10806967         3.066       0.0022
GENDER      1      0.174286     0.02170402         8.030       0.0001
CHRONIC     1      0.455891     0.02459747        18.534       0.0001
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 A, G, A*G, & A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         4        65.38640       16.34660     17.669    0.0001
Error       754       697.56782        0.92516
C Total     758       762.95422

Root MSE      0.96185     R-square    0.0857
Dep Mean      5.94880     Adj R-sq    0.0809
C.V.         16.16883

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      5.295362     0.16108118        32.874       0.0001
AGE91       1      0.007575     0.00790842         0.958       0.3385
GENDER      1      0.273517     0.17671759         1.548       0.1221
CROSS91     1     -0.000719     0.00430324        -0.167       0.8674
ASQ91       1      0.000139     0.00010222         1.360       0.1743

DEPENDENT VARIABLE:     Year-2 Charges
INDEPENDENT VARIABLES:  Year-1 Age & Gender Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         8        55.90290        6.98786      7.412    0.0001
Error       750       707.05132        0.94274
C Total     758       762.95422

Root MSE      0.97095     R-square    0.0733
Dep Mean      5.94880     Adj R-sq    0.0634
C.V.         16.32172

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      5.774202     0.09104033        63.425       0.0001
AGE1        1      0.047148     0.37547400         0.126       0.9001
AGE2        1     -0.547795     0.21049917        -2.602       0.0094
AGE3        1     -0.463240     0.22144386        -2.092       0.0368
AGE4        1     -0.246377     0.18401421        -1.339       0.1810
AGE5        1     -0.119538     0.09860868        -1.212       0.2258
AGE7        1      0.380518     0.10277277         3.703       0.0002
AGE8        1      0.509900     0.23286821         2.190       0.0289
GENDER      1      0.224308     0.07707894         2.910       0.0037
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 ADGs, A, G, A*G, A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        38        93.64041        2.46422      2.651    0.0001
Error       720       669.31381        0.92960
C Total     758       762.95422

Root MSE      0.96416     R-square    0.1227
Dep Mean      5.94880     Adj R-sq    0.0764
C.V.         16.20763


Parameter  Estimates 


Parameter 

Standard 

T  for  HO: 

Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  > 

|T| 

INTERCEP 

1 

5. 436880 

0. 

18643840 

29 . 

162 

0. 

0001 

AGE  91 

1 

0.000442 

0. 

00878014 

0 . 

050 

0. 

9599 

GENDER 

1 

0.062039 

0. 

19174149 

0 . 

324 

0. 

7464 

CROSS 91 

1 

0.002951 

0. 

00453659 

0 . 

651 

0. 

5155 

ASQ91 

1 

0.  000189 

0. 

00011554 

1 . 

636 

0. 

1022 

ADG9101 

1 

-0. 002296 

0. 

08741624 

-0 . 

026 

0. 

9791 

ADG9102 

1 

-0. 081325 

0. 

07687045 

-1 . 

058 

0. 

2904 

ADG9103 

1 

0. 095285 

0. 

09485164 

1 . 

005 

0. 

3154 

ADG9104 

1 

0. 057739 

0. 

12299833 

0 . 

469 

0. 

6389 

ADG9105 

1 

0.232870 

0. 

21757917 

1 . 

070 

0. 

2849 

ADG9106 

1 

-0. 017748 

0. 

17443375 

-0. 

102 

0. 

9190 

ADG9107 

1 

-0. 040342 

0. 

08767777 

-0  . 

460 

0. 

6456 

ADG9108 

1 

-0. 078865 

0. 

08868388 

-0. 

889 

0 . 

3742 

ADG9109 

1 

0.001107 

0. 

20081243 

0. 

006 

0. 

9956 

ADG9110 

1 

-0. 013192 

0. 

08833275 

-0. 

149 

0. 

8813 

ADG9111 

1 

0. 148083 

0. 

09893566 

1. 

497 

0. 

1349 

ADG9112 

1 

0. 023201 

0. 

22151567 

0. 

105 

0. 

9166 

ADG9113 

1 

-0.404387 

0. 

37475381 

-1. 

079 

0. 

2809 

ADG9114 

1 

-0. 025932 

0. 

10362850 

-0. 

250 

0. 

8025 

ADG9115 

1 

0. 801678 

0. 

69283948 

1. 

157 

0. 

2476 

ADG9116 

1 

0. 170014 

0. 

22042029 

0. 

771 

0. 

4408 

ADG9117 

1 

-0. 024474 

0. 

14901015 

-0. 

164 

0. 

8696 

ADG9118 

1 

-0. 168381 

0. 

21763997 

-0. 

774 

0. 

4394 

ADG9119 

1 

0.267053 

0. 

29192917 

0. 

915 

0. 

3606 

ADG9120 

1 

0.035462 

0. 

13757994 

0. 

258 

0. 

7967 

ADG9121 

1 

-0. 141922 

0. 

09799940 

-1. 

448 

0. 

1480 

ADG9122 

1 

0. 093560 

0. 

10761635 

0. 

869 

0. 

3849 

ADG9123 

1 

-0. 108343 

0. 

13372885 

-0. 

810 

0. 

4181 

ADG9124 

1 

0. 154294 

0. 

32263881 

0. 

478 

0. 

6326 

ADG9125 

1 

0.048902 

0. 

16488475 

0. 

297 

0. 

7669 

ADG9126 

1 

-0. 045743 

0. 

08422345 

-0. 

543 

0. 

5872 

ADG9127 

1 

0. 180710 

0. 

10649150 

1. 

697 

0. 

0901 

ADG9128 

1 

0.082809 

0. 

08183004 

1. 

012 

0. 

3119 

ADG9129 

1 

0. 021777 

0. 

10387704 

0. 

210 

0. 

8340 

ADG9130 

1 

-0. 047929 

0. 

23195631 

-0. 

207 

0. 

8364 

ADG9131 

1 

-0.006832 

0. 

08457914 

-0. 

081 

0. 

9356 

ADG9132 

1 

0. 178005 

0. 

12776091 

1. 

393 

0. 

1640 

ADG9133 

1 

0.250731 

0. 

11112174 

2. 

256 

0. 

0243 

ADG9134 

1 

-0. 179208 

0. 

69997793 

-0. 

256 

0. 

7980 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 ADGs & A/G Categories

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        42        86.83411        2.06748      2.189    0.0001
Error       716       676.12011        0.94430
C Total     758       762.95422

Root MSE      0.97175     R-square    0.1138
Dep Mean      5.94880     Adj R-sq    0.0618
C.V.         16.33527

Parameter  Estimates 


Parameter 

Standard 

T  for  HO: 

Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  > 

|T|

INTERCEP 

1 

5 .745007 

0 . 

10732225 

53.530 

0 . 

0001 

AGE1 

1 

0 . 153209 

0 . 

39147097 

0 .  391 

0 . 

6956 

AGE  2 

1 

-0 . 414087 

0  . 

22475628 

-1 . 842 

0 . 

0658 

AGE  3 

1 

-0 . 329811 

0  . 

23038392 

-1 . 432 

0 . 

1527 

AGE  4 

1 

-0 . 205367 

0 . 

19600044 

-1.048 

0 . 

2951 

AGE  5 

1 

-0 . 156195 

0 . 

10972340 

-1 . 424 

0 . 

1550 

AGE  7 

1 

0 . 317018 

0  . 

10767507 

2  .  944 

0 . 

0033 

AGE  8 

1 

0 . 287517 

0 . 

25972121 

1.107 

0 . 

2687 

GENDER 

1 

0 . 157519 

0 . 

08478821 

1.858 

0 . 

0636 

ADG9101 

1 

-0 . 015511 

0 . 

08860017 

-0.175 

0 . 

8611 

ADG9102 

1 

-0.095093 

0  . 

07743598 

-1.228 

0 . 

2198 

ADG9103 

1 

0.103882 

0 . 

09593773 

1.083 

0 . 

2793 

ADG9104 

1 

0 .  039720 

0 . 

12484786

0.318

0 .

7505

ADG9105 

1 

0 . 225205 

0 . 

21995254 

1 .  024 

0 . 

3062 

ADG9106 

1 

-0 . 056419 

0 . 

17714592 

-0.318 

0 . 

7502 

ADG9107 

1 

-0.050109 

0 . 

08833808 

-0.567 

0  . 

5707 

ADG9108 

1 

-0 . 086248 

0 . 

08950350 

-0.964 

0 . 

3356 

ADG9109 

1 

0.061001 

0. 

20343793 

0  .  300 

0. 

7644 

ADG9110 

1 

0. 014441 

0. 

08956257 

0.  161 

0. 

8719 

ADG9111 

1 

0. 178976 

0. 

09943355 

1.  800 

0. 

0723 

ADG9112 

1 

0. 034939 

0. 

22357588 

0.  156 

0. 

8759 

ADG9113 

1 

-0.359502 

0. 

37850439 

-0. 950 

0. 

3425 

ADG9114 

1 

-0. 030124 

0. 

10521835 

-0.286 

0. 

7747 

ADG9115 

1 

0. 865700 

0. 

69831458 

1.240 

0. 

2155 

ADG9116 

1 

0. 156061 

0. 

22215681 

0.702 

0. 

4826 

ADG9117 

1 

-0. 009785 

0. 

15000400 

-0. 065 

0. 

9480 

ADG9118 

1 

0. 081555 

0. 

21296352 

0.  383 

0. 

7019 

ADG9119 

1 

0.253346 

0. 

29438992 

0.  861 

0. 

3898 

ADG9120 

1 

0. 030558 

0. 

13900643 

0.220 

0. 

8261 

ADG9121 

1 

-0 . 148914 

0. 

09997234 

-1.490 

0. 

1368 

ADG9122 

1 

0. 100654 

0. 

10939785 

0.  920 

0. 

3578 

ADG9123 

1 

-0. 143967 

0. 

13574366 

-1. 061 

0. 

2892 

ADG9124 

1 

0. 182302 

0. 

32612867 

0.559 

0. 

5763 

ADG9125 

1 

0. 020644 

0. 

16630483 

0.  124 

0. 

9012 

ADG9126 

1 

-0. 038239 

0. 

08515280 

-0. 449 

0. 

6535 

ADG9127 

1 

0.170592 

0. 

10759744 

1.585 

0. 

1133 

ADG9128 

1 

0. 086660 

0. 

08258203 

1.049 

0. 

2944 

ADG9129 

1 

0.019744 

0. 

10490046 

0.188 

0. 

8508 

ADG9130 

1 

-0. 037826 

0. 

23406319 

-0. 162 

0. 

8717 

ADG9131 

1 

-0. 011202 

0. 

08631224 

-0. 130 

0. 

8968 

ADG9132 

1 

0. 174135 

0. 

12904912 

1.349 

0. 

1776 

ADG9133 

1 

0 .240692 

0. 

11562845

2.  082 

0. 

0377 

ADG9134 

1 

-0.247828 

0. 

70614160 

-0.351 

0. 

7257 



Four-part  model  using  the  log  of  dollars  —  Inpatient  Users 




DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  1991 ACGs

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model        43       101.81027        2.36768      2.561    0.0001
Error       715       661.14395        0.92468
C Total     758       762.95422

Root MSE      0.96160     R-square    0.1334
Dep Mean      5.94880     Adj R-sq    0.0813
C.V.         16.16464


Parameter  Estimates 


Parameter 

Standard 

T  for  HO: 

Variable 

DF 

Estimate 

Error 

Parameter=0 

Prob  >   | T | 

INTERCEP 

1 

5 . 900390 

0 . 

17856486 

33 . 

043 

0 . 0001 

ACG9102 

1 

0 .291259 

0 . 

70301055 

0 . 

414 

0. 6788 

ACG9103 

1 

-0 . 334073 

0 . 

24842241 

-1 . 

345 

0.1791 

ACG9104 

1 

-0 . 173529 

0 . 

36691590 

-0. 

473 

0 . 6364 

ACG9105 

1 

0 . 100471 

0 . 

38401825 

0. 

262 

0 . 7937 

ACG9108 

1 

-0 . 543756 

0 . 

70301055 

-0. 

773 

0 . 4395 

ACG9109 

1 

0 . 067477 

0 . 

38401825 

0. 

176 

0 . 8606 

ACG9111 

1 

0  .240517 

0 . 

58319039 

0 . 

412 

0 . 6802 

ACG9112 

1 

0 . 679868 

0 . 

58319039 

1 . 

166 

0.2441 

ACG9113 

1 

-1 . 411396 

0 . 

70301055 

-2  . 

008 

0.0451 

ACG9114 

1 

0 . 073843 

0 . 

97804001 

0 . 

076 

0.9398 

ACG9116 

1 

-1 . 275219 

0 . 

51288851 

-2  . 

486 

0.0131 

ACG9117 

1 

-0 . 043476 

0 . 

43127511 

-0. 

101 

0.9197 

ACG9118 

1 

-0 . 101040 

0 . 

25041560 

-0. 

403 

0 . 6867 

ACG9120

1 

0 . 181424 

0 . 

58319039 

0. 

311 

0.7558 

ACG9121

1 

-0. 285287 

0 . 

36691590 

-0. 

778 

0.4371

ACG9122 

1 

-0. 110685 

0. 

97804001 

-0. 

113 

0. 9099 

ACG9123 

1 

-0.269964 

0. 

51288851 

-0. 

526 

0.5988 

ACG9124 

1 

0. 802837 

0. 

70301055 

1. 

142 

0.2538 

ACG9125 

1 

-2 . 879965 

0. 

97804001 

-2. 

945 

0.0033 

ACG9126 

1 

-2.813143 

0. 

97804001 

-2. 

876 

0. 0041 

ACG9128 

1 

0. 079750 

0. 

34050982 

0. 

234 

0.8149 

ACG9129 

1 

0. 127888 

0. 

97804001 

0. 

131 

0.8960 

ACG9130 

1 

0. 196870 

0. 

43127511 

0. 

456 

0.6482 

ACG9131 

1 

-0. 146747 

0. 

70301055 

-0. 

209 

0.8347 

ACG9132 

1 

-0. 111965 

0. 

25252884 

-0. 

443 

0. 6576 

ACG9134 

1 

-1.888171 

0. 

58319039 

-3. 

238 

0. 0013 

ACG9135 

1 

0.447544 

0. 

70301055 

0. 

637 

0. 5246 

ACG9136 

1 

0. 145461 

0. 

26243567 

0. 

554 

0.5796 

ACG9137 

1 

-1. 343907 

0. 

51288851 

-2. 

620 

0. 0090 

ACG9138 

1 

-0. 658434 

0. 

38401825 

-1. 

715 

0.0869 

ACG9139 

1 

0.297298 

0. 

51288851  . 

0. 

580 

0.5623 

ACG9140 

1 

-0.031829 

0. 

23106173 

-0. 

138 

0. 8905 

ACG9141 

1 

0.501544 

0. 

22529173 

2. 

226 

0. 0263 

ACG9142 

1 

-0. 581730 

0. 

34050982 

-1. 

708 

0.0880 

ACG9143 

1 

0. 102477 

0. 

20451936 

0. 

501 

0.6165 

ACG9144 

1 

0. 139706 

0. 

22616257 

0. 

618 

0.5370 

ACG9145 

1 

-0.449130 

0. 

36691590 

-1. 

224 

0.2213 

ACG9146 

1 

-0. 017180 

0. 

33006335 

-0. 

052 

0.9585 

ACG9147 

1 

-0.661860 

0. 

46564018 

-1. 

421 

0. 1556 

ACG9148 

1 

-0.035878 

0. 

22529173 

-0. 

159 

0. 8735 

ACG9149 

1 

0.273133 

0. 

20351452 

1. 

342 

0. 1800 

ACG9150 

1 

0. 532280 

0. 

22210840 

2. 

396 

0. 0168 

ACG9151 

1 

-0. 083100 

0. 

23993932 

-0. 

346 

0.7292 
DEPENDENT VARIABLE:     Year-2 Charges                    HMO-A
INDEPENDENT VARIABLES:  Year-1 Chronic Flag, A, G, A*G, & A*A

                         Sum of           Mean
Source       DF         Squares         Square    F Value    Prob>F

Model         5        67.55136       13.51027     14.629    0.0001
Error       753       695.40286        0.92351
C Total     758       762.95422

Root MSE      0.96099     R-square    0.0885
Dep Mean      5.94880     Adj R-sq    0.0825
C.V.         16.15443

Parameter Estimates

                  Parameter       Standard     T for H0:
Variable   DF      Estimate          Error   Parameter=0   Prob > |T|

INTERCEP    1      5.265503     0.16211499        32.480       0.0001
AGE91       1      0.007262     0.00790403         0.919       0.3585
GENDER      1      0.281120     0.17663010         1.592       0.1119
CROSS91     1     -0.001181     0.00430998        -0.274       0.7842
ASQ91       1      0.000129     0.00010235         1.260       0.2082
CHRONIC     1      0.115468     0.07541496         1.531       0.1262

DEPENDENT VARIABLE:   Year-2 Charges

INDEPENDENT VARIABLES:   Year-1 Chronic Flag & A/G Categories

                       Sum of         Mean
Source       DF       Squares       Square    F Value    Prob>F

Model         9      59.22567      6.58063      7.004    0.0001
Error       749     703.72855      0.93956
C Total     758     762.95422

Root MSE     0.96931     R-square    0.0776
Dep Mean     5.94880     Adj R-sq    0.0665
C.V.        16.29419

Parameter Estimates

                  Parameter      Standard    T for H0:
Variable   DF      Estimate         Error    Parameter=0    Prob > |T|

INTERCEP    1      5.695780    0.09999709         56.959        0.0001
AGE1        1      0.050066    0.37484388          0.134        0.8938
AGE2        1     -0.504691    0.21139042         -2.387        0.0172
AGE3        1     -0.422872    0.22211010         -1.904        0.0573
AGE4        1     -0.201656    0.18523666         -1.089        0.2767
AGE5        1     -0.083971    0.10024267         -0.838        0.4025
AGE7        1      0.357418    0.10333216          3.459        0.0006
AGE8        1      0.462156    0.23385762          1.976        0.0485
GENDER      1      0.209049    0.07737555          2.702        0.0071
CHRONIC     1      0.143576    0.07634724          1.881        0.0604
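
The summary statistics in the two HMO-A outputs above follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them with the standard least-squares formulas; the function name `anova_summary` and its argument names are illustrative and do not come from the original analysis, and the adjusted R-square formula assumes a model with an intercept.

```python
import math


def anova_summary(ss_model, ss_error, df_model, df_error, dep_mean):
    """Recompute regression summary statistics from an ANOVA table."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error          # n - 1 for a model with intercept
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    root_mse = math.sqrt(ms_error)
    r_square = ss_model / ss_total
    return {
        "f_value": ms_model / ms_error,
        "root_mse": root_mse,
        "cv": 100.0 * root_mse / dep_mean,  # coefficient of variation, percent
        "r_square": r_square,
        "adj_r_square": 1.0 - (1.0 - r_square) * df_total / df_error,
    }


# Inputs copied from the two HMO-A outputs above.
continuous = anova_summary(67.55136, 695.40286, 5, 753, 5.94880)
categorical = anova_summary(59.22567, 703.72855, 9, 749, 5.94880)

for label, stats in (("A, G, A*G, A*A", continuous),
                     ("A/G categories", categorical)):
    print(label, {key: round(value, 4) for key, value in stats.items()})
```

Both models are fitted to the same 759 observations, so the total sum of squares is identical and the fit statistics can be compared directly.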
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HMO-A 


Mean Expected Values

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      75.17    75.17    75.17    75.17    75.17    75.17    75.17
25k      72.60    72.60    72.60    72.60    72.60    72.60    72.60
10k      66.22    66.22    66.22    66.22    66.22    66.22    66.22
5k       57.68    57.68    57.68    57.68    57.68    57.68    57.68

One-part model using log transformation, Estimation Half
50k      70.33    69.37    70.40    69.60    79.69    77.67    77.41
25k      68.26    67.28    68.66    67.88    77.65    75.57    75.31
10k      63.40    62.35    64.17    63.39    72.07    70.30    70.01
5k       56.04    55.12    56.96    56.31    63.56    62.16    61.91

Two-part model using log transformation, Estimation Half
50k      72.37    71.59    71.25    70.54    73.82    73.98    73.73
25k      70.20    69.38    69.37    68.66    71.66    71.93    71.68
10k      65.14    64.22    64.67    63.93    66.38    66.91    66.63
5k       57.53    56.74    57.31    56.71    58.53    59.19    58.95

Four-part model using log transformation, Estimation Half
50k      76.51    76.48    75.50    75.53    75.32    76.31    75.89
25k      73.99    73.84    73.07    72.99    73.02    73.79    73.40
10k      67.62    67.35    66.89    66.61    66.92    67.34    67.10
5k       58.75    58.48    58.22    57.88    58.37    58.60    58.47

One-part model using actual dollars, Validation Half
50k      73.81    73.62    75.09    75.01    76.29    76.38    76.21
25k      71.27    71.14    72.46    72.43    73.72    73.68    73.56
10k      65.01    64.99    66.02    66.07    67.17    67.03    66.98
5k       56.72    56.73    57.54    57.60    58.44    58.40    58.38

One-part model using log transformation, Validation Half
50k      69.59    68.75    70.91    70.27    80.83    80.07    80.13
25k      67.55    66.68    69.16    68.53    78.76    77.91    77.95
10k      62.75    61.80    64.63    63.99    73.05    72.46    72.42
5k       55.48    54.66    57.36    56.85    64.33    64.05    63.99

Two-part model using log transformation, Validation Half
50k      71.52    70.83    71.46    70.90    75.11    75.73    75.70
25k      69.38    68.65    69.57    69.00    72.91    73.63    73.59
10k      64.37    63.55    64.85    64.25    67.49    68.47    68.37
5k       56.87    56.17    57.47    56.99    59.43    60.53    60.44

Four-part model using log transformation, Validation Half
50k      75.27    75.19    75.49    75.56    76.54    77.85    77.63
25k      72.80    72.62    73.05    73.03    74.20    75.23    75.04
10k      66.58    66.33    66.86    66.61    67.93    68.53    68.36
5k       57.96    57.71    58.23    57.95    59.19    59.62    59.52
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HMO-B  Mean  Expected  Values 


        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      93.63    93.63    93.63    93.63    93.63    93.63    93.63
25k      87.40    87.40    87.40    87.40    87.40    87.40    87.40
10k      77.13    77.13    77.13    77.13    77.13    77.13    77.13
5k       64.99    64.99    64.99    64.99    64.99    64.99    64.99

One-part model using log transformation, Estimation Half
50k      92.38    91.51    91.33    90.34    95.24    96.37    95.97
25k      87.32    86.28    86.91    85.89    90.51    92.08    91.68
10k      78.24    77.14    78.60    77.64    81.69    83.37    83.01
5k       66.59    65.68    67.18    66.42  [illegible]  71.23    70.95

Two-part model using log transformation, Estimation Half
50k      86.30    85.52    85.42    84.57  [missing]    90.11    89.78
25k      81.59    80.65    81.28    80.38    84.53    86.09    85.75
10k      73.16    72.16    73.52    72.67    76.32    77.99    77.68
5k       62.36    61.53    62.92    62.24    65.17    66.73    66.48

Four-part model using log transformation, Estimation Half
50k      97.19    97.31    96.58    96.62    98.94    99.53    99.55
25k      91.39    91.28    91.05    90.95    93.04    94.14    94.11
10k      80.92    80.56    80.91    80.52    82.23    83.41    83.32
5k       67.84    67.41    67.96    67.48    68.80    69.64    69.50

One-part model using actual dollars, Validation Half
50k      82.07    82.50    82.85    83.30    86.09    81.90    82.11
25k      77.40    77.74    78.08    78.44    80.90    77.66    77.80
10k      69.32    69.57    69.86    70.11    71.85    69.78    69.88
5k       59.01    59.21    59.43    59.63    60.89    59.43    59.50

One-part model using log transformation, Validation Half
50k      83.57    83.35    87.75    87.97    99.96   106.89   107.50
25k      79.00    78.54    83.94    84.09    95.41   102.33   102.88
10k      70.80    70.21    76.51    76.60    86.60    92.89    93.34
5k       60.45    59.99    65.79    65.89    74.39    79.47    79.81

Two-part model using log transformation, Validation Half
50k      82.85    83.03    82.40    82.29    88.88    86.44    86.38
25k      78.37    78.33    78.47    78.29    84.50    82.75    82.67
10k      70.37    70.18    71.12    70.91    76.33    75.24    75.16
5k       60.10    59.93    61.00    60.85    65.19    64.55    64.49

Four-part model using log transformation, Validation Half
50k      85.52    85.93    85.40    85.69    91.08    88.44    88.75
25k      80.81    81.02    80.90    81.05    85.90    84.24    84.44
10k      72.27    72.27    72.59    72.30    76.29    75.52    75.57
5k       61.20    61.15    61.58    61.32    64.20    63.59    63.54
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HMO-A  Mean  Expected  Values  (Before  Smearing) 


        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      75.17    75.17    75.17    75.17    75.17    75.17    75.17
25k      72.60    72.60    72.60    72.60    72.60    72.60    72.60
10k      66.22    66.22    66.22    66.22    66.22    66.22    66.22
5k       57.68    57.68    57.68    57.68    57.68    57.68    57.68

One-part model using log transformation, Estimation Half
50k      19.98    19.59    21.14    20.92    23.94    25.31    25.08
25k      19.96    19.57    21.11    20.89    23.90    25.25    25.03
10k      19.79    19.43    20.91    20.72    23.64    24.92    24.73
5k       19.39    19.08    20.45    20.28    22.99    24.14    23.97

Two-part model using log transformation, Estimation Half
50k      23.93    23.60    24.62    24.43    26.31    27.62    27.44
25k      23.89    23.57    24.58    24.40    26.27    27.57    27.39
10k      23.69    23.40    24.36    24.20    25.99    27.23    27.07
5k       23.20    22.94    23.82    23.68    25.29    26.41    26.29

Four-part model using log transformation, Estimation Half
50k      44.38    44.15    44.38    44.22    44.94    46.10    45.77
25k      43.96    43.73    43.96    43.81    44.51    45.63    45.30
10k      41.70    41.53    41.75    41.58    42.35    43.11    42.94
5k       37.30    37.16    37.40    37.21    38.10    38.59    38.51

Four-part model using log (ambulatory), Estimation Half
50k      17.45    17.32    18.93    18.88    18.56    17.79    17.72
25k      17.45    17.32    18.93    18.88    18.56    17.79    17.72
10k      17.44    17.31    18.92    18.87    18.55    17.78    17.70
5k       17.38    17.25    18.82    18.77    18.45    17.71    17.63

Four-part model using log (inpatient), Estimation Half
50k      26.93    26.83    25.45    25.35    26.38    28.31    28.05
25k      26.51    26.41    25.03    24.93    25.95    27.84    27.59
10k      24.26    24.22    22.83    22.71    23.81    25.32    25.24
5k       19.92    19.91    18.58    18.44    19.65    20.88    20.89

Four-part model: inpatient as % of total, Estimation Half
50k      60.7%    60.8%    57.3%    57.3%    58.7%    61.4%    61.3%
25k      60.3%    60.4%    56.9%    56.9%    58.3%    61.0%    60.9%
10k      58.2%    58.3%    54.7%    54.6%    56.2%    58.7%    58.8%
5k       53.4%    53.6%    49.7%    49.6%    51.6%    54.1%    54.2%
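
The four-part rows above are internally consistent: the ambulatory and inpatient means add up to the four-part total, and the inpatient percentage is the inpatient mean divided by that total. The short check below uses the 50k estimation-half row for HMO-A; the variable names are illustrative only, and the tolerances allow for rounding in the published two-decimal figures.

```python
# Before-smearing means, HMO-A, estimation half, 50k stoploss
# (columns: A_G(1), A_G(2), CHR(1), CHR(2), ACG, ADG(1), ADG(2)).
ambulatory = [17.45, 17.32, 18.93, 18.88, 18.56, 17.79, 17.72]
inpatient = [26.93, 26.83, 25.45, 25.35, 26.38, 28.31, 28.05]
published_total = [44.38, 44.15, 44.38, 44.22, 44.94, 46.10, 45.77]
published_share = [60.7, 60.8, 57.3, 57.3, 58.7, 61.4, 61.3]

# Total expected value is the sum of the two component means.
totals = [a + i for a, i in zip(ambulatory, inpatient)]
# Inpatient share of the total, in percent.
shares = [100.0 * i / t for i, t in zip(inpatient, totals)]

for total, share in zip(totals, shares):
    print(f"total = {total:6.2f}   inpatient share = {share:4.1f}%")
```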

HMO-A Mean Expected Values (Before Smearing)

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Validation Half
50k      73.81    73.62    75.09    75.01    76.29    76.38    76.21
25k      71.27    71.14    72.46    72.43    73.72    73.68    73.56
10k      65.01    64.99    66.02    66.07    67.17    67.03    66.98
5k       56.72    56.73    57.54    57.60    58.44    58.40    58.38

One-part model using log transformation, Validation Half
50k      19.77    19.42    21.30    21.12    24.28    26.09    25.96
25k      19.75    19.40    21.26    21.09    24.24    26.04    25.91
10k      19.58    19.26    21.06    20.91    23.96    25.69    25.58
5k       19.19    18.92    20.59    20.47    23.27    24.87    24.78

Two-part model using log transformation, Validation Half
50k      23.65    23.35    24.69    24.56    26.77    28.27    28.18
25k      23.61    23.32    24.65    24.52    26.72    28.22    28.12
10k      23.41    23.15    24.43    24.32    26.43    27.86    27.78
5k       22.94    22.71    23.89    23.80    25.68    27.01    26.95

Four-part model using log transformation, Validation Half
50k      43.60    43.33    44.35    44.22    45.72    47.04    46.84
25k      43.18    42.93    43.93    43.80    45.28    46.54    46.34
10k      40.98    40.81    41.70    41.53    43.06    43.88    43.74
5k       36.72    36.59    37.38    37.21    38.70    39.27    39.20

Four-part model using log (ambulatory), Validation Half
50k      17.32    17.21    17.84    17.80    18.71    19.25    19.22
25k      17.32    17.21    17.84    17.80    18.71    19.25    19.22
10k      17.31    17.21    17.83    17.79    18.69    19.24    19.21
5k       17.25    17.15    17.76    17.72    18.59    19.13    19.10

Four-part model using log (inpatient), Validation Half
50k      26.28    26.12    26.51    26.41    27.01    27.79    27.62
25k      25.86    25.72    26.09    26.00    26.58    27.29    27.12
10k      23.67    23.60    23.87    23.74    24.37    24.65    24.54
5k       19.47    19.44    19.63    19.49    20.11    20.15    20.10

Four-part model: inpatient as % of total, Validation Half
50k      60.3%    60.3%    59.8%    59.7%    59.1%    59.1%    59.0%
25k      59.9%    59.9%    59.4%    59.4%    58.7%    58.6%    58.5%
10k      57.8%    57.8%    57.2%    57.2%    56.6%    56.2%    56.1%
5k       53.0%    53.1%    52.5%    52.4%    52.0%    51.3%    51.3%
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HMO-B  Mean  Expected  Values  (Before  Smearing) 

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      93.63    93.63    93.63    93.63    93.63    93.63    93.63
25k      87.40    87.40    87.40    87.40    87.40    87.40    87.40
10k      77.13    77.13    77.13    77.13    77.13    77.13    77.13
5k       64.99    64.99    64.99    64.99    64.99    64.99    64.99

One-part model using log transformation, Estimation Half
50k      19.03    18.38    21.35    21.07    24.86    28.97    28.81
25k      18.96    18.33    21.26    20.99    24.72    28.76    28.59
10k      18.75    18.14    20.98    20.71    24.28    28.08    27.92
5k       18.24    17.68    20.33    20.09    23.35    26.76    26.61

Two-part model using log transformation, Estimation Half
50k      25.76    25.38    26.63    26.46    28.71    30.87    30.81
25k      25.67    25.29    26.52    26.36    28.56    30.66    30.60
10k      25.35    25.00    26.17    26.01    28.08    30.02    29.95
5k       24.62    24.30    25.37    25.22    27.04    28.74    28.68

Four-part model using log transformation, Estimation Half
50k      53.75    53.63    54.10    54.08    56.23    57.54    57.50
25k      52.58    52.44    52.93    52.88    54.78    56.09    56.05
10k      49.05    48.85    49.40    49.19    50.91    51.96    51.90
5k       42.46    42.22    42.80    42.53    43.91    44.67    44.57

Four-part model using log (ambulatory), Estimation Half
50k      17.09    16.85    17.47    17.33    18.16    18.90    18.84
25k      17.09    16.85    17.47    17.33    18.16    18.90    18.84
10k      17.08    16.84    17.46    17.32    18.14    18.88    18.82
5k       17.00    16.76    17.37    17.24    18.05    18.77    18.72

Four-part model using log (inpatient), Estimation Half
50k      36.66    36.79    36.63    36.75    38.08    38.64    38.66
25k      35.49    35.60    35.46    35.55    36.62    37.19    37.21
10k      31.98    32.01    31.94    31.88    32.76    33.08    33.08
5k       25.47    25.45    25.43    25.29    25.86    25.90    25.85

Four-part model: inpatient as % of total, Estimation Half
50k      68.2%    68.6%    67.7%    68.0%    67.7%    67.2%    67.2%
25k      67.5%    67.9%    67.0%    67.2%    66.9%    66.3%    66.4%
10k      65.2%    65.5%    64.7%    64.8%    64.4%    63.7%    63.7%
5k       60.0%    60.3%    59.4%    59.5%    58.9%    58.0%    58.0%
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HMO-B  Mean  Expected  Values  (Before  Smearing) 


        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Validation Half
50k      82.07    82.50    82.85    83.30    86.09    81.90    82.11
25k      77.40    77.74    78.08    78.44    80.90    77.66    77.80
10k      69.32    69.57    69.86    70.11    71.85    69.78    69.88
5k       59.01    59.21    59.43    59.63    60.89    59.43    59.50

One-part model using log transformation, Validation Half
50k      17.09    16.78    19.31    19.33    23.11    26.05    26.08
25k      17.04    16.74    19.24    19.27    23.00    25.91    25.93
10k      16.87    16.60    19.02    19.05    22.60    25.40    25.41
5k       16.47    16.23    18.49    18.53    21.77    24.29    24.30

Two-part model using log transformation, Validation Half
50k      23.11    23.03    24.02    24.10    26.79    27.68    27.73
25k      23.04    22.97    23.95    24.03    26.66    27.55    27.59
10k      22.80    22.74    23.68    23.75    26.23    27.09    27.12
5k       22.22    22.17    23.03    23.11    25.31    26.04    26.07

Four-part model using log transformation, Validation Half
50k      46.99    47.02    47.53    47.63    51.58    50.91    51.06
25k      46.14    46.16    46.67    46.74    50.37    49.99    50.09
10k      43.45    43.42    43.96    43.75    47.02    46.90    46.93
5k       38.03    37.97    38.49    38.29    40.79    40.70    40.66

Four-part model using log (ambulatory), Validation Half
50k      15.79    15.71    16.23    16.21    17.20    17.37    17.33
25k      15.79    15.71    16.23    16.21    17.20    17.37    17.33
10k      15.78    15.71    16.22    16.21    17.19    17.36    17.32
5k       15.72    15.65    16.15    16.15    17.10    17.27    17.23

Four-part model using log (inpatient), Validation Half
50k      31.20    31.31    31.30    31.42    34.38    33.54    33.72
25k      30.35    30.44    30.44    30.54    33.17    32.62    32.76
10k      27.67    27.71    27.73    27.53    29.83    29.55    29.60
5k       22.31    22.32    22.34    22.14    23.68    23.44    23.43

Four-part model: inpatient as % of total, Validation Half
50k      66.4%    66.6%    65.9%    66.0%    66.7%    65.9%    66.1%
25k      65.8%    66.0%    65.2%    65.3%    65.9%    65.3%    65.4%
10k      63.7%    63.8%    63.1%    62.9%    63.4%    63.0%    63.1%
5k       58.7%    58.8%    58.0%    57.8%    58.1%    57.6%    57.6%
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HMO-A  Smearing  Factors 

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using log transformation
50k      3.519    3.541    3.330    3.327    3.329    3.069    3.086
25k      3.420    3.438    3.253    3.250    3.249    2.993    3.009
10k      3.204    3.208    3.069    3.060    3.049    2.821    2.832
5k       2.890    2.890    2.786    2.777    2.765    2.575    2.583

Two-part model using log transformation
50k      3.025    3.034    2.894    2.887    2.806    2.678    2.686
25k      2.938    2.944    2.822    2.814    2.728    2.610    2.617
10k      2.749    2.745    2.655    2.642    2.554    2.458    2.461
5k       2.479    2.473    2.406    2.394    2.314    2.241    2.243

Four-part model using log (ambulatory)
50k      2.076    2.077    2.017    2.015    1.947    1.905    1.905
25k      2.076    2.077    2.017    2.015    1.947    1.905    1.905
10k      2.067    2.066    2.009    2.006    1.941    1.899    1.899
5k       2.005    2.004    1.953    1.950    1.891    1.854    1.853

Four-part model using log (inpatient)
50k      1.496    1.510    1.490    1.503    1.485    1.482    1.485
25k      1.424    1.434    1.421    1.430    1.422    1.413    1.417
10k      1.302    1.304    1.300    1.303    1.299    1.298    1.299
5k       1.200    1.201    1.200    1.201    1.195    1.199    1.200
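
For HMO-A, the smeared mean expected values equal the before-smearing means multiplied by the smearing factors above, and the four-part total is the sum of the two smeared components. The sketch below checks this against the 50k estimation-half figures. How the factors themselves were estimated is not shown in these tables; `smearing_factor` assumes Duan's smearing estimator (the mean of the exponentiated log-scale residuals), which is the usual retransformation for log models, and all function names are illustrative.

```python
import math


def smearing_factor(log_residuals):
    """Assumed estimator: mean of exp(residual) on the log scale (Duan)."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def retransform(mean_before_smearing, factor):
    """Scale a before-smearing mean back to the dollar scale."""
    return mean_before_smearing * factor


# HMO-A, estimation half, 50k stoploss, A_G(1) column (values from the tables).
one_part = retransform(19.98, 3.519)             # published: 70.33
four_part = (retransform(17.45, 2.076)           # ambulatory component
             + retransform(26.93, 1.496))        # inpatient component
                                                 # published total: 76.51

print(f"one-part log model:  {one_part:.2f}")
print(f"four-part log model: {four_part:.2f}")

# With zero residuals the factor is 1 and no adjustment is applied.
print(smearing_factor([0.0, 0.0]))
```

Small differences from the published values are expected because the tabulated means and factors are rounded.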
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HMO-B  Smearing  Factors 

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using log transformation
50k      4.890    4.968    4.545    4.551    4.325    4.103    4.122
25k      4.636    4.692    4.363    4.364    4.149    3.950    3.968
10k      4.196    4.230    4.024    4.021    3.832    3.657    3.673
5k       3.671    3.696    3.559    3.556    3.417    3.271    3.284

Two-part model using log transformation
50k      3.586    3.605    3.430    3.414    3.318    3.122    3.115
25k      3.402    3.411    3.277    3.259    3.169    3.003    2.996
10k      3.086    3.086    3.004    2.985    2.910    2.778    2.771
5k       2.704    2.703    2.648    2.633    2.576    2.479    2.474

Four-part model using log (ambulatory)
50k      2.256    2.263    2.213    2.213    2.152    2.113    2.116
25k      2.256    2.263    2.213    2.213    2.152    2.113    2.116
10k      2.234    2.239    2.195    2.194    2.138    2.100    2.102
5k       2.152    2.156    2.117    2.115    2.063    2.028    2.029

Four-part model using log (inpatient)
50k      1.600    1.609    1.581    1.586    1.572    1.542    1.544
25k      1.489    1.494    1.478    1.480    1.473    1.457    1.458
10k      1.338    1.339    1.333    1.334    1.326    1.323    1.323
5k       1.227    1.229    1.226    1.227    1.221    1.219    1.220
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HMO-A 


Adjusted R-Square

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      0.034    0.027    0.048    0.043    0.062    0.084    0.080
25k      0.041    0.034    0.058    0.053    0.079    0.104    0.100
10k      0.044    0.039    0.066    0.064    0.102    0.127    0.124
5k       0.046    0.043    0.072    0.071    0.118    0.145    0.143

One-part model using log transformation, Estimation Half
50k      0.028    0.021    0.045    0.039    0.052    0.014    0.008
25k      0.034    0.027    0.056    0.049    0.067    0.016    0.009
10k      0.032    0.033    0.061    0.060    0.080   -0.015   -0.014
5k       0.028    0.036    0.062    0.067    0.084   -0.049   -0.039

Two-part model using log transformation, Estimation Half
50k      0.029    0.023    0.045    0.040    0.057    0.059    0.054
25k      0.036    0.029    0.056    0.050    0.075    0.075    0.070
10k      0.036    0.035    0.063    0.062    0.094    0.079    0.079
5k       0.035    0.039    0.067    0.070    0.106    0.076    0.082

Four-part model using log transformation, Estimation Half
50k      0.033    0.024    0.049    0.043    0.060    0.061    0.069
25k      0.040    0.031    0.060    0.054    0.078    0.081    0.090
10k      0.042    0.038    0.067    0.065    0.100    0.112    0.116
5k       0.043    0.042    0.072    0.072    0.116    0.131    0.135

One-part model using actual dollars, Validation Half
50k      0.029    0.024    0.040    0.037    0.042    0.065    0.063
25k      0.037    0.032    0.051    0.048    0.061    0.085    0.083
10k      0.045    0.040    0.065    0.063    0.086    0.111    0.110
5k       0.047    0.043    0.071    0.070    0.097    0.126    0.124

One-part model using log transformation, Validation Half
50k      0.021    0.017    0.035    0.031    0.035    0.038    0.034
25k      0.027    0.024    0.046    0.043    0.048    0.042    0.037
10k      0.032    0.031    0.059    0.058    0.058    0.031    0.026
5k       0.031    0.035    0.062    0.064    0.058    0.009    0.007

Two-part model using log transformation, Validation Half
50k      0.023    0.019    0.035    0.031    0.039    0.055    0.052
25k      0.030    0.026    0.046    0.043    0.057    0.072    0.068
10k      0.036    0.034    0.061    0.059    0.076    0.088    0.085
5k       0.037    0.038    0.066    0.067    0.082    0.088    0.087

Four-part model using log transformation, Validation Half
50k      0.028    0.023    0.041    0.036    0.043    0.061    0.061
25k      0.036    0.030    0.053    0.049    0.062    0.079    0.078
10k      0.043    0.040    0.066    0.065    0.086    0.104    0.105
5k       0.046    0.044    0.072    0.071    0.097    0.118    0.119
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HMO-B  Adjusted  R-Square 

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, Estimation Half
50k      0.030    0.023    0.045    0.041    0.063    0.098    0.095
25k      0.036    0.029    0.055    0.051    0.081    0.117    0.114
10k      0.042    0.036    0.065    0.063    0.104    0.138    0.137
5k       0.049    0.042    0.075    0.073    0.122    0.158    0.156

One-part model using log transformation, Estimation Half
50k      0.002    0.012    0.027    0.028    0.047   -0.100   -0.100
25k     -0.003    0.014    0.028    0.032    0.051   -0.190   -0.186
10k     -0.013    0.018    0.020    0.030    0.041   -0.384   -0.371
5k      -0.023    0.018    0.007    0.024    0.026   -0.551   -0.531

Two-part model using log transformation, Estimation Half
50k      0.020    0.019    0.039    0.038    0.060    0.085    0.078
25k      0.024    0.023    0.048    0.048    0.076    0.095    0.088
10k      0.026    0.028    0.056    0.058    0.090    0.085    0.080
5k       0.027    0.033    0.061    0.066    0.099    0.078    0.076

Four-part model using log transformation, Estimation Half
50k      0.030    0.019    0.047    0.038    0.061    0.099    0.083
25k      0.035    0.023    0.057    0.047    0.077    0.110    0.096
10k      0.039    0.030    0.066    0.061    0.097    0.116    0.106
5k       0.044    0.037    0.074    0.071    0.114    0.135    0.129

One-part model using actual dollars, Validation Half
50k      0.025    0.023    0.038    0.038    0.069    0.067    0.066
25k      0.031    0.029    0.047    0.047    0.084    0.091    0.090
10k      0.042    0.041    0.062    0.062    0.108    0.123    0.121
5k       0.050    0.050    0.074    0.075    0.127    0.147    0.145

One-part model using log transformation, Validation Half
50k     -0.012    0.014    0.022    0.025    0.046   -0.152   -0.162
25k     -0.020    0.017    0.023    0.029    0.054   -0.206   -0.218
10k     -0.035    0.023    0.019    0.033    0.054   -0.342   -0.354
5k      -0.051    0.027    0.009    0.031    0.042   -0.495   -0.505

Two-part model using log transformation, Validation Half
50k      0.020    0.020    0.038    0.038    0.066    0.070    0.069
25k      0.020    0.024    0.043    0.045    0.079    0.078    0.077
10k      0.023    0.032    0.053    0.058    0.098    0.088    0.089
5k       0.025    0.040    0.061    0.069    0.110    0.089    0.092

Four-part model using log transformation, Validation Half
50k      0.030    0.022    0.045    0.040    0.067    0.062    0.052
25k      0.034    0.027    0.053    0.048    0.083    0.079    0.071
10k      0.042    0.039    0.066    0.065    0.107    0.105    0.100
5k       0.047    0.050    0.076    0.078    0.127    0.132    0.131

(Individual level)  Mean Absolute Error

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, HMO-A
50k      96.09    96.24    95.34    95.36    93.20    91.90    91.97
25k      89.76    90.02    88.98    89.11    86.91    85.52    85.67
10k      78.44    78.77    77.60    77.80    75.62    74.13    74.31
5k       65.41    65.70    64.54    64.68    62.63    61.27    61.39

One-part model using log transformation, HMO-A
50k      93.80    93.73    92.61    92.48    95.92    93.37    93.56
25k      87.86    87.77    86.88    86.75    89.94    87.38    87.56
10k      77.43    77.27    76.65    76.53    78.90    76.61    76.77
5k       64.96    64.84    64.23    64.17    65.54    64.04    64.10

Two-part model using log transformation, HMO-A
50k      94.87    94.88    93.07    92.94    92.41    90.62    90.83
25k      88.86    88.84    87.24    87.11    86.40    84.72    84.93
10k      78.28    78.18    76.90    76.75    75.70    74.27    74.48
5k       65.63    65.56    64.42    64.33    62.95    61.90    62.03

Four-part model using log transformation, HMO-A
50k      96.66    96.88    95.06    95.05    93.11    91.46    91.42
25k      90.47    90.62    88.94    88.88    87.04    85.32    85.31
10k      79.23    79.29    77.81    77.80    75.92    74.12    74.17
5k       66.01    66.06    64.71    64.67    62.91    61.33    61.37

One-part model using actual dollars, HMO-B
50k     109.12   109.01   108.08   107.81   103.43   103.09   103.24
25k     102.94   102.96   101.74   101.63    97.58    95.99    96.27
10k      88.92    89.16    87.69    87.79    84.11    81.89    82.25
5k       71.18    71.36    70.05    70.07    66.93    65.03    65.24

One-part model using log transformation, HMO-B
50k     110.20   110.04   110.46   110.50   111.70   115.89   116.25
25k     104.25   103.96   105.00   104.98   105.73   109.89   110.25
10k      90.28    89.95    91.46    91.42    91.39    95.42    95.72
5k       72.42    72.20    73.34    73.37    72.64    76.78    77.00

Two-part model using log transformation, HMO-B
50k     109.26   109.32   106.84   106.54   104.99   101.94   101.82
25k     103.38   103.33   101.38   101.04    99.36    96.71    96.58
10k      89.56    89.43    88.06    87.76    85.86    83.61    83.46
5k       71.80    71.69    70.56    70.32    68.26    66.56    66.45

Four-part model using log transformation, HMO-B
50k     110.07   110.43   107.84   107.93   106.08   102.24   102.45
25k     104.12   104.33   102.11   102.04   100.02    96.72    96.83
10k      90.10    90.12    88.42    88.19    85.96    83.11    83.10
5k       72.03    71.98    70.62    70.42    68.08    65.69    65.62
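
The individual-level measures tabulated in this appendix (mean absolute error, its standard deviation, and the percentage of absolute errors within $25 or $50) can be computed as sketched below. This is a minimal illustration on made-up data, not the original analysis code. It assumes that a stoploss level caps each person's charges at the threshold, that the sample (n - 1) standard deviation is used, and that "within" means less than or equal to the band; none of these details is stated in the tables themselves.

```python
import statistics


def apply_stoploss(charges, threshold):
    """Cap each charge at the stoploss threshold (assumed definition)."""
    return [min(charge, threshold) for charge in charges]


def absolute_error_summary(actual, expected, bands=(25.0, 50.0)):
    """Summarise individual-level absolute prediction errors."""
    errors = [abs(a - e) for a, e in zip(actual, expected)]
    summary = {
        "mean_abs_error": statistics.mean(errors),
        "sd_abs_error": statistics.stdev(errors),  # sample SD, an assumption
    }
    for band in bands:
        within = sum(1 for err in errors if err <= band)
        summary[f"pct_within_{band:g}"] = 100.0 * within / len(errors)
    return summary


# Hypothetical actual and expected values for four individuals.
actual = [10.0, 40.0, 200.0, 0.0]
expected = [30.0, 80.0, 90.0, 70.0]

result = absolute_error_summary(actual, expected)
print(result)
print(apply_stoploss([1000.0, 60000.0], 50000.0))
```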
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(Individual  level) 


Standard Deviation of Absolute Error

        A_G(1)   A_G(2)   CHR(1)   CHR(2)      ACG   ADG(1)   ADG(2)

One-part model using actual dollars, HMO-A
50k     258.02   258.67   256.67   257.09   256.50   253.66   253.91
25k     195.02   195.50   193.64   193.91   192.85   190.65   190.82
10k     132.59   132.80   131.16   131.22   130.05   128.51   128.56
5k       96.55    96.62    95.35    95.33    94.46    93.24    93.27

One-part model using log transformation, HMO-A
50k     260.11   260.69   258.44   259.00   256.66   257.23   257.75
25k     197.14   197.52   195.22   195.64   193.11   195.10   195.62
10k     134.40   134.48   132.31   132.44   130.70   134.65   134.99
5k       98.03    97.85    96.23    96.11    95.41   100.04   100.15

Two-part model using log transformation, HMO-A
50k     259.43   260.01   258.28   258.80   257.28   255.72   256.08
25k     196.33   196.77   194.98   195.39   193.66   192.65   192.95
10k     133.47   133.68   131.97   132.16   130.92   130.71   130.85
5k       97.14    97.12    95.82    95.81    95.35    95.64    95.63

Four-part model using log transformation, HMO-A
50k     257.98   258.64   256.70   257.29   256.55   254.50   254.40
25k     194.88   195.40   193.49   193.95   192.79   191.58   191.59
10k     132.26   132.53   130.92   131.07   129.87   129.25   129.13
5k       96.29    96.36    95.22    95.32    94.30    93.84    93.71

One-part model using actual dollars, HMO-B
50k     241.36   241.56   239.82   239.90   236.46   237.07   237.12
25k     202.63   202.76   201.06   201.06   197.57   197.50   197.48
10k     140.43   140.42   139.18   139.05   136.09   135.98   135.89
5k       94.69    94.58    93.72    93.55    91.35    91.28    91.23

One-part model using log transformation, HMO-B
50k     246.31   242.47   241.10   240.61   236.22   262.94   263.97
25k     208.52   203.88   202.61   201.68   197.39   227.89   228.99
10k     147.32   141.73   141.22   139.73   137.04   171.63   172.43
5k      101.43    95.71    96.24    94.48    93.84   126.91   127.31

Two-part model using log transformation, HMO-B
50k     242.07   241.88   240.40   240.48   236.21   237.22   237.30
25k     203.75   203.27   201.73   201.66   197.32   199.01   199.05
10k     141.96   141.11   139.86   139.55   136.07   138.66   138.58
5k       96.18    95.12    94.35    93.84    91.81    94.75    94.54

Four-part model using log transformation, HMO-B
50k     240.24   241.15   238.91   239.67   235.59   238.31   239.58
25k     201.66   202.30   200.14   200.77   196.52   198.78   199.76
10k     139.75   139.94   138.32   138.52   135.00   137.13   137.68
5k       94.36    94.12    93.09    93.04    90.52    91.96    92.13
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(Individual  level) 


Percent  of  Absolute  Error  Within  $25 


A_C<i) 

t  1 

;::-:::::::::::;::::x;:v::x:::::::::x:x 


A_C<1)  :  A_G(2)      CHR(1)     CHR(2)  j    ACG       APG(1)  j  AP€<2) 


One-part  model  osing  actual  dollars    HMO-A 


50k 

22.9% 

22.7% 

29.4% 

28.0% 

26.0% 

37.6% 

38.1% 

25k 

23.5% 

23.5% 

29.8% 

28.2% 

27.7% 

38.2% 

38.1% 

10k 

25.1% 

25.0% 

29.0% 

28.2% 

32.4% 

38.1% 

37.9% 

5k 

28.3% 

28.2% 

31.9% 

30.8% 

41.7% 

40.8% 

41.5% 

One-part  model  using  log  transformation 

HMO-A 

50k 

16.0% 

17.0% 

19.7% 

20.3% 

24.4% 

34.5% 

33.3% 

25k 

17.2% 

18.2% 

20.9% 

22.0% 

25.2% 

35.9% 

35.1% 

10k 

19.8% 

21.2% 

25.2% 

25.2% 

28.0% 

39.4% 

38.3% 

5k 

25.4% 

26.7% 

32.9% 

32.6% 

33.0% 

44.7% 

43.5% 

Two-part  model  usin; 

I  log  transformation 

HMO-A 

50k 

15.1% 

15.9% 

17.2% 

18.5% 

26.4% 

30.4% 

28.9% 

25k 

16.2% 

17.2% 

18.3% 

L  19.7% 

27.8% 

31.8% 

31.7% 

10k 

18.8% 

20.1% 

21.1% 

22.4% 

31.5% 

35.6% 

35.1% 

5k 

23.8% 

25.1% 

28.7% 

28.5% 

39.9% 

41.6% 

40.7% 

Four-part  model  using  log  transformation  HMO-A 


50k 

19.4% 

20.1% 

22.3% 

23.9% 

23.7% 

32.5% 

30.9% 

25k 

20.3% 

20.9% 

23.0% 

24.5% 

24.7% 

33.5% 

32.1% 

10k 

22.5% 

23.9% 

24.8% 

25.3% 

33.9% 

35.8% 

34.1% 

5k 

26.5% 

27.8% 

28.8% 

29.5% 

40.6% 

40.7% 

39.9% 

One-part  model  usmg  actual  dollars  HMO-B 


50k 

21.4% 

18  8% 

24.1% 

24.9% 

23.3% 

37.3% 

37.6% 

25k 

21.3% 

19.5% 

23.8% 

24.8% 

23.7% 

36.5% 

38.1% 

10k 

21.8% 

21.2% 

25.0% 

24.7% 

33.6% 

37.0% 

37.7% 

5k 

25.8% 

24.7% 

28.9% 

29.4% 

37.8% 

40.4% 

40.9% 

One-part  model  using  log  transformation 

HMO-B 

50k 

12.6% 

13.5% 

18.5% 

20.2% 

26.9% 

33.7% 

33.7% 

25k 

14.0% 

15.4% 

20.3% 

21.5% 

28.4% 

35.2% 

35.5% 

10k 

17.4% 

19.9% 

24.0% 

24.8% 

31.1% 

38.3% 

38.1% 

5k 

24.1% 

24.8% 

34.6% 

34.0% 

35.4% 

43.2% 

42.8% 

Two-part  model  using  log  transformation 

HMO-B 

50k 

11.3% 

12.3% 

14.2% 

15.7% 

27.4% 

29.9% 

29.1% 

25k 

12.6% 

13.8% 

15.8% 

16.8% 

28.8% 

31.8% 

32.3% 

10k 

15.7% 

17.2% 

19.3% 

21.3% 

31.2% 

35.6% 

35.5% 

5k 

21.8% 

23.0% 

26.4% 

26.8% 

36.4% 

41.6% 

42.1% 

Four-part  model  using  log  transformation 

HMO-B 

50k 

15.8% 

17.9% 

18.5% 

20.2% 

30.2% 

32.2% 

33.7% 

25k 

16.9% 

18.7% 

19.6% 

21.3% 

31.1% 

33.6% 

34.4% 

10k 

19.2% 

20.6% 

22.0% 

22.5% 

33.0% 

36.2% 

36.5% 

5k 

23.8% 

24.6% 

26.6% 

27.0% 

37.4% 

41.3% 

41.3%

(Individual level) Percent of Absolute Error Within $50

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)

One-part model using actual dollars    HMO-A

50k

54.1% 

45.6% 

54.1% 

54.3% 

60.3% 

57.6% 

56.6% 

25k 

54.7% 

48.5% 

54.4% 

54.3% 

61.4% 

58.2% 

57.2% 

10k 

56.1% 

52.4% 

57.2% 

56.2% 

63.1% 

60.9% 

60.3% 

5k 

63.3% 

61.5% 

63.2% 

62.3% 

67.5% 

66.4% 

66.7% 

One-part model using log transformation    HMO-A


50k 

58.7% 

52.0% 

60.5% 

58.3% 

56.7% 

62.8% 

62.3% 

25k 

60.8% 

54.3% 

61.3% 

60.6% 

58.0% 

63.9% 

63.2% 

10k 

64.9% 

58.4% 

63.3% 

62.5% 

61.4% 

66.1% 

65.6% 

5k 

70.3% 

67.2% 

66.9% 

67.3% 

65.6% 

69.0% 

68.9% 

Two-part  model  using  log  transformation    HMO-A 


50k 

54.7% 

47.9% 

58.7% 

56.4% 

61.4% 

63.1% 

62.7% 

25k 

57.3% 

52.0% 

59.7% 

57.6% 

62.4% 

64.2% 

63.5% 

10k 

62.2% 

57.1% 

62.7% 

61.3% 

64.8% 

66.7% 

66.1% 

5k 

68.7% 

64.9% 

67.4% 

67.4% 

68.6% 

70.3% 

70.0% 

Four-part  model  using  log  transformation  HMO-A 


50k 

52.5% 

48.8% 

56.2% 

53.6% 

60.4% 

62.9% 

61.4% 

25k 

54.2% 

50.2% 

57.5% 

54.6% 

61.2% 

63.8% 

62.4% 

10k 

58.0% 

56.6% 

60.4% 

59.1% 

63.8% 

66.5% 

65.8% 

5k 

65.2% 

62.7% 

66.3% 

66.3% 

67.5% 

70.4% 

70.5% 

One-part  model  using  actual  dollars   HMO-B 


50k 

37.6% 

37.8% 

50.0% 

47.7% 

57.2% 

55.5% 

56.3% 

25k 

40.4% 

40.2% 

51.4% 

51.1% 

58.0% 

56.6% 

57.6% 

10k 

47.2% 

43.9% 

53.5% 

52.3% 

60.7% 

59.5% 

59.8% 

5k 

58.6% 

53.2% 

60.7% 

59.4% 

65.8% 

65.0% 

64.8% 

One-part model using log transformation    HMO-B


50k 

44.2% 

40.9% 

53.4% 

51.5% 

50.6% 

57.4% 

57.0% 

25k 

47.1% 

43.5% 

57.6% 

53.3% 

52.5% 

58.8% 

58.4% 

10k 

53.9% 

51.3% 

60.4% 

57.2% 

55.6% 

61.2% 

60.8% 

5k 

64.9% 

58.3% 

63.2% 

62.6% 

59.7% 

64.8% 

64.4% 

Two-part model using log transformation    HMO-B

50k 

42.0% 

38.0% 

50.9% 

48.7% 

54.2% 

58.5% 

57.9% 

25k 

46.3% 

42.3% 

54.6% 

50.8% 

56.2% 

60.1% 

59.7% 

10k 

54.1% 

50.3% 

59.0% 

55.3% 

60.3% 

63.2% 

63.2% 

5k 

64.4% 

60.6% 

63.9% 

63.0% 

64.6% 

67.6% 

67.1% 

Four-part  model  using  log  transformation 

HMO-B 

50k 

42.8% 

39.4% 

48.6% 

50.0% 

54.8% 

59.2% 

59.6% 

25k 

45.3% 

41.3% 

50.8% 

51.2% 

56.1% 

60.4% 

60.8% 

10k 

51.3% 

47.3% 

55.3% 

54.3% 

59.4% 

63.4% 

63.3% 

5k 

59.8% 

60.6% 

62.3% 

60.7% 

64.4% 

67.6% 

67.5% 

(Individual level) Percent of Absolute Error More Than $400

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)

One-part model using actual dollars    HMO-A


50k 

3.2% 

3.2% 

3.2% 

3.1% 

2.9% 

2.8% 

2.7% 

25k 

3.3% 

3.2% 

3.2% 

3.1% 

2.9% 

2.8% 

2.8% 

10k 

3.4% 

3.3% 

3.3% 

3.3% 

3.1% 

2.9% 

2.9% 

5k 

1.1% 

1.1% 

1.0% 

0.9% 

1.1% 

0.9% 

1.0% 

One-part  model  using  log  transformation 

HMO-A 

50k 

3.4% 

3.4% 

3.2% 

3.3% 

3.3% 

3.4% 

3.5% 

25k 

3.4% 

3.4% 

3.3% 

3.3% 

3.2% 

3.4% 

3.4% 

10k 

3.5% 

3.5% 

3.3% 

3.3% 

3.0% 

3.3% 

3.3% 

5k 

1.1% 

1.2% 

1.0% 

1.0% 

1.0% 

1.4% 

1.3% 

Two-part  model  using  log  transformation  HMO-A 


50k 

3.3% 

3.3% 

3.2% 

3.2% 

2.9% 

3.1% 

3.1% 

25k 

3.4% 

3.3% 

3.3% 

3.2% 

3.0% 

3.2% 

3.2% 

10k 

3.5% 

3.4% 

3.3% 

3.3% 

3.1% 

3.2% 

3.1% 

5k 

1.1% 

1.2% 

1.0% 

1.0% 

1.0% 

1.1% 

1.1% 

Four-part  model  using  log  transformation 

HMO-A

50k 

3.2% 

3.2% 

3.1% 

3.1% 

2.9% 

3.1% 

3.1% 

25k 

3.2% 

3.2% 

3.1% 

3.1% 

3.0% 

3.1% 

3.1% 

10k 

3.3% 

3.3% 

3.3% 

3.3% 

3.1% 

3.0% 

3.0% 

5k 

1.1% 

1.1% 

0.9% 

0.9% 

1.1% 

0.9% 

0.9% 

One-part  model  using  actual  dollars  HMO-B 


50k 

4.4% 

4.4% 

4.2% 

4.1% 

3.8% 

4.4% 

4.4% 

25k 

4.4% 

4.4% 

4.3% 

4.2% 

3.9% 

4.2% 

4.2% 

10k 

4.5% 

4.5% 

4.5% 

4.4% 

4.2% 

4.1% 

4.1% 

5k 

1.4% 

1.4% 

1.4% 

1.5% 

1.3% 

1.2% 

1.2% 

One-part  model  using  log  transformation   HMO-B 


50k 

4.6% 

4.4% 

4.3% 

4.2% 

5.3% 

6.0% 

6.0% 

25k 

4.5% 

4.4% 

4.2% 

4.2% 

5.3% 

5.9% 

5.8% 

10k 

4.6% 

4.5% 

4.3% 

4.2% 

4.6% 

5.4% 

5.5% 

5k 

1.6% 

1.6% 

1.5% 

1.5% 

1.3% 

2.7% 

2.7% 

Two-part model using log transformation    HMO-B


50k 

4.4% 

4.5% 

4.3% 

4.3% 

4.5% 

4.7% 

4.5% 

25k 

4.4% 

4.5% 

4.3% 

4.3% 

4.1% 

4.6% 

4.5% 

10k 

4.5% 

4.5% 

4.4% 

4.4% 

4.0% 

4.4% 

4.3% 

5k 

1.5% 

1.5% 

1.5% 

1.5% 

1.3% 

1.5% 

1.5% 

Four-part  model  using  log  transformation 

HMO-B 

50k

4.3% 

4.4% 

4.1% 

4.2% 

4.3% 

4.7% 

4.6% 

25k 

4.3% 

4.4% 

4.2% 

4.2% 

3.8% 

4.5% 

4.5% 

10k 

4.4% 

4.5% 

4.3% 

4.3% 

4.0% 

4.1% 

4.2% 

5k 

1.4% 

1.4% 

1.3% 

1.4% 

1.3% 

1.3% 

1.3% 

HMO-A Mean Forecasting Bias

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars  Groups  of  5000 


50k  -7.4% 

-7.7% 

-5.7% 

-5.9% 

-4.2% 

-4.2% 

-4.5% 

25k  -5.2% 

-5.4% 

-3.6% 

-3.7% 

-1.9% 

-2.1% 

-2.3% 

10k  -4.0% 

-4.1% 

-2.5% 

-2.4% 

-0.7% 

-1.0% 

-1.1% 

5k  -4.4% 

-4.4% 

-3.0% 

-2.9% 

-1.4% 

-1.5% 

-1.6% 

One-part  model  using  log  transformation 

Groups of 5000

50k  -12.5% 

-13.5% 

-10.8% 

-11.6% 

1.6% 

0.5% 

0.6% 

25k  -10.0% 

-11.1% 

-7.8% 

-8.6% 

4.9% 

3.7% 

3.7% 

10k  -7.2% 

-8.6% 

-4.4% 

-5.3% 

8.1% 

7.0% 

7.0% 

5k  -6.4% 

-7.7% 

-3.2% 

-4.0% 

8.6% 

8.0% 

7.9% 

Two-part  model  using  log  transformation 

Groups of 5000

50k  -10.1% 

-10.9% 

-16.9% 

-17.1% 

-5.5% 

2.7% 

2.1% 

25k  -7.6% 

-8.5% 

-14.3% 

-14.5% 

-2.8% 

5.9% 

5.3% 

10k  -4.9% 

-6.0% 

-11.2% 

-11.5% 

-0.2% 

9.2% 

8.4% 

5k  -4.1% 

-5.2% 

-9.7% 

-9.9% 

0.4% 

9.6% 

8.8% 

Four-part  model  using  log  transformation 

Groups of 5000

50k  -5.5% 

-5.6% 

-5.3% 

-5.2% 

-3.8% 

-2.4% 

-2.7% 

25k  -3.2% 

-3.4% 

-2.8% 

-2.8% 

-1.2% 

-0.1% 

-0.3% 

10k  -1.7% 

-2.0% 

-1.2% 

-1.6% 

0.5% 

1.1% 

0.9% 

5k  -2.3% 

-2.7% 

-1.8% 

-2.2% 

-0.0% 

0.5% 

0.3% 

One-part  model  using  actual  dollars 

Groups  of  3000 

50k  -8.5% 

-8.8% 

-7.1% 

-7.3% 

-5.5% 

-5.6% 

-5.8% 

25k  -6.0% 

-6.2% 

-4.6% 

-4.7% 

-2.9% 

-3.1% 

-3.3% 

10k  -4.2% 

-4.3% 

-2.9% 

-2.9% 

-1.2% 

-1.5% 

-1.6% 

5k  -4.5% 

-4.5% 

-3.2% 

-3.2% 

-1.7% 

-1.9% 

-1.9% 

One-part model using log transformation

Groups  of  3000 

50k  -13.8% 

-14.9% 

-12.3% 

-13.1% 

0.0% 

-1.3% 

-1.3% 

25k  -11.0% 

-12.2% 

-9.0% 

-9.8% 

3.7% 

2.2% 

2.2% 

10k  -7.6% 

-9.1% 

-5.0% 

-6.0% 

7.4% 

6.2% 

6.1% 

5k  -6.6% 

-8.0% 

-3.6% 

-4.5% 

8.2% 

7.3% 

7.2% 

Two-part model using log transformation

Groups  of  3000 

50k  -11.4% 

-12.3% 

-18.1% 

-18.4% 

-7.1% 

1.0% 

0.4% 

25k  -8.6% 

-9.5% 

-15.3% 

-15.5% 

-4.0% 

4.5% 

3.9% 

10k  -5.2% 

-6.5% 

-11.7% 

-12.0% 

-0.8% 

8.5% 

7.6% 

5k  -4.3% 

-5.5% 

-10.0% 

-10.2% 

-0.1% 

9.0% 

8.2% 

Four-part model using log transformation

Groups  of 3000 

50k  -6.7% 

-6.9% 

-6.5% 

-6.5% 

-5.3% 

-3.8% 

-4.2% 

25k  -4.0% 

-4.3% 

-3.7% 

-3.8% 

-2.3% 

-1.1% 

-1.4% 

10k  -1.9% 

-2.4% 

-1.6% 

-2.0% 

-0.1% 

0.6% 

0.3% 

5k  -2.4% 

-2.9% 

-2.0% 

-2.5% 

-0.5% 

0.1% 

-0.1% 

HMO-A Mean Forecasting Bias

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars   Groups  of  1500 


50k 

-7.5% 

-7.6% 

-5.8% 

-5.8% 

-3.8% 

-4.0% 

-4.0% 

25k 

-5.2% 

-5.3% 

-3.6% 

-3.5% 

-1.4% 

-1.7% 

-1.7% 

10k 

-4.0% 

-4.0% 

-2.5% 

-2.3% 

-0.4% 

-0.6% 

-0.6% 

5k 

-4.5% 

-4.4% 

-3.1% 

-2.9% 

-1.3% 

-1.3% 

-1.3% 

One-part  model  using  log  transformation 

Groups  of  1500 

50k 

-12.8% 

-13.7% 

-11.1% 

-11.8% 

1.9%

1.4% 

1.5% 

25k 

-10.3% 

-11.3% 

-8.0% 

-8.7% 

5.3% 

4.7% 

4.8% 

10k 

-7.5% 

-8.8% 

-4.6% 

-5.4% 

8.3% 

8.0% 

8.0% 

5k 

-6.8% 

-8.0% 

-3.5% 

-4.3% 

8.7% 

8.8% 

8.8% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

-10.4% 

-11.1% 

-17.0% 

-17.2% 

-5.3% 

3.4% 

2.8% 

25k 

-7.8% 

-8.6% 

-14.4% 

-14.5% 

-2.5% 

6.7% 

6.1% 

10k 

-5.1% 

-6.2% 

-11.3% 

-11.6% 

0.1% 

10.0% 

9.2% 

5k 

-4.4% 

-5.5% 

-9.9% 

-10.1% 

0.5% 

10.2% 

9.4% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

-5.7% 

-5.6% 

-5.3% 

-5.1% 

-3.5% 

-1.9% 

-2.1% 

25k 

-3.2% 

-3.3% 

-2.8% 

-2.7% 

-0.8% 

0.5% 

0.4% 

10k 

-1.7% 

-2.0% 

-1.2% 

-1.5% 

0.7% 

1.7% 

1.5% 

5k 

-2.5% 

-2.8% 

-1.9% 

-2.4% 

-0.0% 

0.8% 

0.7% 

One-part  model  using  actual  dollars    Groups  of  500 

50k         -7.9%         -1.1%         -6.3%         -6.1%  -4.0%  -3.9%  -3.9%

25k         -5.8%         -5.6%         -4.3%         -4.0%  -1.7%  -1.9%  -1.9% 

10k         -4.4%         -4.1%         -3.0%         -2.7%  -0.5%  -0.8%  -0.8% 

5k          -4.3%         -4.0%         -3.0%         -2.7%  -0.9%  -1.0%  -0.9% 


One-part model using log transformation

Groups  of  500 

50k 

-12.7% 

-13.2% 

-11.1% 

-11.4% 

1.5% 

0.6% 

1.0% 

25k 

-10.3% 

-10.9% 

-8.2% 

-8.6% 

4.7% 

3.6% 

4.0% 

10k 

-7.5% 

-8.4% 

-4.7% 

-5.2% 

7.7% 

6.9% 

7.2% 

5k 

-6.3% 

-7.2% 

-3.2% 

-3.6% 

8.4% 

8.1% 

8.3% 

Two-part  model  using  log  transformation 

Groups  of  500 

50k 

-10.2% 

-10.6% 

-17.0% 

-16.9% 

-5.3% 

3.0% 

2.7% 

25k 

-7.8% 

-8.3% 

-14.5% 

-14.4% 

-2.7% 

6.1% 

5.8% 

10k 

-5.1% 

-5.8% 

-11.5% 

-11.4% 

-0.1% 

9.4% 

8.9% 

5k 

-4.0% 

-4.7% 

-9.6% 

-9.5% 

0.7% 

10.0% 

9.5% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

-6.1% 

-5.7% 

-5.8% 

-5.3% 

-3.5% 

-2.2% 

-2.1% 

25k 

-3.7% 

-3.5% 

-3.4% 

-3.0% 

-1.0% 

0.0% 

0.2% 

10k 

-2.1% 

-2.0% 

-1.7% 

-1.8% 

0.7% 

1.3% 

1.3% 

5k 

-2.3% 

-2.3% 

-1.8% 

-2.0% 

0.5% 

1.0% 

1.1% 

HMO-B Mean Forecasting Bias

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)

One-part  model  using  actual  dollars   Groups  of  5000 


50k 

-6.1% 

-5.5% 

-5.3% 

-4.7% 

-1.6% 

-6.6% 

-6.3% 

25k 

-8.1% 

-7.6% 

-7.4% 

-7.0% 

-4.1% 

-8.1% 

-7.9% 

10k 

-7.9% 

-7.5% 

-7.3% 

-6.9% 

-4.7% 

-7.5% 

-7.4% 

5k 

-6.8% 

-6.4% 

-6.2% 

-5.9% 

-3.9% 

-6.3% 

-6.2% 

One-part  model  using  log  transformation 

Groups  of  5000 

50k 

-4.4% 

-4.6% 

0.2% 

0.5% 

14.2% 

21.7% 

22.4% 

25k 

-6.2% 

-6.8% 

-0.6% 

-0.4% 

13.0% 

20.9% 

21.5% 

10k 

-6.0% 

-6.7% 

1.4% 

1.5% 

14.8% 

22.8% 

23.4% 

5k 

-4.5% 

-5.3% 

3.7% 

3.9% 

17.3% 

24.9% 

25.4% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k 

-5.2% 

-4.9% 

-14.3% 

-14.2% 

1.5% 

8.3% 

7.9% 

25k 

-7.0% 

-7.0% 

-14.8% 

-14.6% 

0.1% 

6.8% 

6.3% 

10k 

-6.5% 

-6.8% 

-12.8% 

-12.7% 

1.2% 

7.7% 

7.1% 

5k 

-5.1% 

-5.3% 

-9.9% 

-9.8% 

2.8% 

8.5% 

8.0% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

-2.1% 

-1.6% 

-2.3% 

-2.0% 

4.1% 

0.9% 

1.3% 

25k 

-4.0% 

-3.7% 

-4.0% 

-3.8% 

1.9% 

-0.3% 

-0.0% 

10k 

-4.0% 

-3.9% 

-3.7% 

-4.0% 

1.2% 

0.0% 

0.1% 

5k 

-3.3% 

-3.4% 

-2.8% 

-3.2% 

1.3% 

0.2% 

0.1% 

One-part  model  using  actual  dollars  Groups  of  3000 


50k 

-6.4% 

-5.9% 

-5.6% 

-5.0% 

-2.1% 

-6.8% 

-6.5% 

25k 

-8.5% 

-8.1% 

-7.8% 

-7.3% 

-4.6% 

-8.3% 

-8.1% 

10k 

-8.4% 

-8.0% 

-7.7% 

-7.4% 

-5.3% 

-7.8% 

-7.7% 

5k 

-7.3% 

-6.9% 

-6.7% 

-6.3% 

-4.6% 

-6.7% 

-6.6% 

One-part  model  using  log  transformation 

Groups  of  3000 

50k 

-4.8% 

-5.2% 

-0.1% 

0.1% 

13.5% 

21.4% 

22.0% 

25k 

-6.7% 

-7.4% 

-0.9% 

-0.8% 

12.3% 

20.5% 

21.1% 

10k 

-6.5% 

-7.4% 

1.0% 

1.0% 

14.0% 

22.3% 

22.8% 

5k 

-5.0% 

-5.9% 

3.3% 

3.4% 

16.5% 

24.5% 

25.0% 

Two-part  model  using  log  transformation 

Groups  of  3000 

50k 

-5.6% 

-5.5% 

-14.6% 

-14.6% 

0.9% 

8.0% 

7.6% 

25k 

-7.5% 

-7.6% 

-15.1% 

-15.1% 

-0.5% 

6.4% 

6.0% 

10k 

-7.1% 

-7.4% 

-13.2% 

-13.2% 

0.5% 

7.3% 

6.7% 

5k 

-5.6% 

-5.9% 

-10.4% 

-10.3% 

2.1% 

8.1% 

7.6% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

-2.4% 

-1.9% 

-2.6% 

-2.2% 

3.5% 

0.7% 

1.1% 

25k 

-4.4% 

-4.1% 

-4.4% 

-4.1% 

1.2% 

-0.5% 

-0.3% 

10k 

-4.4% 

-4.4% 

-4.1% 

-4.5% 

0.5% 

-0.3% 

-0.3% 

5k 

-3.8% 

-3.9% 

-3.3% 

-3.7% 

0.6% 

-0.2% 

-0.3% 



HMO-B Mean Forecasting Bias

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars  Groups  of  1500 


50k 

-6.4% 

-5.8% 

-6.1% 

-5.5% 

-2.3% 

-7.8% 

-7.4% 

25k 

-8.2% 

-7.7% 

-7.9% 

-7.4% 

-4.6% 

-8.9% 

-8.6% 

10k 

-7.9% 

-7.5% 

-7.6% 

-7.3% 

-5.1% 

-8.2% 

-8.0% 

5k 

-6.2% 

-5.9% 

-6.0% 

-5.6% 

-3.8% 

-6.4% 

-6.3% 

One-part  model  using  log  transformation 

Groups  of  1500 

50k 

-4.6% 

-4.5% 

-0.4% 

0.0% 

13.0% 

19.3% 

19.9% 

25k 

-6.3% 

-6.6% 

-1.0% 

-0.7% 

12.1% 

18.7% 

19.3% 

10k 

-6.2% 

-6.6% 

0.9% 

1.1% 

13.7% 

20.5% 

21.1% 

5k 

-4.2% 

-4.7% 

3.7% 

3.9% 

16.7% 

23.3% 

23.9% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

-5.3% 

-4.9% 

-14.6% 

-14.5% 

0.8% 

7.1% 

6.8% 

25k 

-6.9% 

-6.8% 

-15.0% 

-14.8% 

-0.5% 

5.8% 

5.4% 

10k 

-6.6% 

-6.7% 

-13.1% 

-12.9% 

0.5% 

6.7% 

6.1% 

5k 

-4.7% 

-4.8% 

-9.8% 

-9.6% 

2.5% 

8.1% 

7.6% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

-2.3% 

-1.8% 

-2.9% 

-2.6% 

3.2% 

-0.0% 

0.3% 

25k 

-4.1% 

-3.8% 

-4.4% 

-4.2% 

1.2% 

-1.0% 

-0.8% 

10k 

-4.0% 

-4.0% 

-4.0% 

-4.4% 

0.5% 

-0.6% 

-0.5% 

5k 

-2.8% 

-2.8% 

-2.6% 

-3.0% 

1.2% 

0.2% 

0.1% 

One-part  model  using  actual  dollars  Groups  of  500 


50k 

-6.2% 

-6.4% 

-4.4% 

-4.5% 

-2.5% 

-6.7% 

-6.9% 

25k 

-8.3% 

-8.4% 

-6.7% 

-6.8% 

-4.9% 

-8.6% 

-8.7% 

10k 

-7.5% 

-7.5% 

-6.1% 

-6.2% 

-4.8% 

-7.7% 

-7.7% 

5k 

-6.2% 

-6.2% 

-4.9% 

-5.0% 

-3.8% 

-6.3% 

-6.4% 

One-part  model  using  log  transformation  Groups  of  500 


50k 

-3.1% 

-2.9% 

2.6% 

3.2% 

14.0% 

20.6% 

21.6% 

25k 

-5.0% 

-5.1% 

1.7% 

2.2% 

12.7% 

19.6% 

20.5% 

10k 

-4.2% 

-4.6% 

4.2% 

4.6% 

15.0% 

22.0% 

22.8% 

5k 

-2.8% 

-3.1% 

6.5% 

6.9% 

17.4% 

24.1% 

24.9% 

Two-part model using log transformation

Groups  of  500 

50k 

-4.3% 

-4.1% 

-12.9% 

-12.8% 

1.6% 

7.9% 

7.5% 

25k 

-6.1% 

-6.2% 

-13.4% 

-13.4% 

0.0% 

6.2% 

5.8% 

10k 

-5.1% 

-5.4% 

-10.9% 

-10.9% 

1.6% 

7.6% 

7.1% 

5k 

-3.7% 

-4.0% 

-8.1% 

-8.1% 

3.1% 

8.5% 

8.0% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

-2.4% 

-2.2% 

-2.0% 

-2.1% 

3.1% 

-0.8% 

-0.6% 

25k 

-4.2% 

-4.3% 

-3.7% 

-3.9% 

0.8% 

-2.0% 

-1.9% 

10k 

-3.4% 

-3.6% 

-2.6% 

-3.0% 

0.9% 

-0.9% 

-0.9% 

5k 

-2.5% 

-2.8% 

-1.5% 

-2.0% 

1.3% 

-0.4% 

-0.5%

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part model using actual dollars


Groups  of  5000 


50k 

0.7% 

0.7% 

0.5% 

0.5% 

0.3% 

0.3% 

0.3% 

25k 

0.4% 

0.4% 

0.2% 

0.2% 

0.2% 

0.2% 

0.2% 

10k 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

5k 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

One-part  model  using  log  transformation  Groups  of  5000 


50k 

1.7% 

2.0% 

1.3% 

1.5% 

0.2% 

0.2% 

0.2% 

25k 

1.1% 

1.4% 

0.7% 

0.9% 

0.4% 

0.3% 

0.3% 

10k 

0.6% 

0.8% 

0.3% 

0.4% 

0.8% 

0.6% 

0.6% 

5k 

0.5% 

0.7% 

0.2% 

0.2% 

0.8% 

0.8% 

0.8% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k    1.2%    1.3%    3.0%    3.0%    0.5%    0.3%    0.2%
25k    0.7%    0.8%    2.1%    2.2%    0.2%    0.5%    0.5%
10k    0.3%    0.4%    1.3%    1.4%    0.1%    1.0%    0.8%
5k     0.2%    0.3%    1.0%    1.0%    0.1%    1.0%    0.9%

Four-part  model  using  log  transformation  Groups  of  5000 


50k 

0.4% 

0.5% 

0.4% 

0.4% 

0.3% 

0.2% 

0.2% 

25k 

0.2% 

0.2% 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

10k 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

5k 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

One-part model using actual dollars    Groups of 3000

50k    0.9%    1.0%    0.7%    0.7%    0.5%    0.5%    0.6%
25k    0.5%    0.6%    0.4%    0.4%    0.3%    0.3%    0.3%
10k    0.3%    0.3%    0.2%    0.2%    0.2%    0.2%    0.2%
5k     0.3%    0.3%    0.2%    0.2%    0.1%    0.1%    0.1%

One-part  model  using  log  transformation 

Groups  of  3000 

50k    2.1%    2.4%    1.7%    1.9%    0.3%    0.3%    0.3%
25k    1.4%    1.6%    1.0%    1.1%    0.4%    0.3%    0.3%
10k    0.7%    1.0%    0.4%    0.5%    0.8%    0.6%    0.6%
5k     0.5%    0.8%    0.2%    0.3%    0.8%    0.8%    0.7%

Two-part  model  using  log  transformation 

Groups  of 3000 

50k    1.5%    1.7%    3.5%    3.5%    0.8%    0.3%    0.3%
25k    0.9%    1.1%    2.5%    2.5%    0.4%    0.5%    0.4%
10k    0.4%    0.6%    1.5%    1.5%    0.2%    0.9%    0.8%
5k     0.3%    0.4%    1.1%    1.1%    0.1%    1.0%    0.8%

Four-part  model  using  log  transformation  Groups  of 3000 


50k 

0.7% 

0.7% 

0.6% 

0.6% 

0.5% 

0.4% 

0.4% 

25k 

0.3% 

0.4% 

0.3% 

0.3% 

0.3% 

0.2% 

0.2% 

10k 

0.2% 

0.2% 

0.2% 

0.2% 

0.2% 

0.1% 

0.1% 

5k 

0.2% 

0.2% 

0.1% 

0.2% 

0.1% 

0.1% 

0.1%

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars  Groups  of  1500 


50k 

1.2% 

1.2% 

1.0% 

1.0% 

0.9% 

0.8% 

0.8% 

25k 

0.7% 

0.7% 

0.6% 

0.6% 

0.6% 

0.5% 

0.5% 

10k 

0.5% 

0.5% 

0.4% 

0.4% 

0.4% 

0.3% 

0.3% 

5k 

0.4% 

0.4% 

0.3% 

0.3% 

0.3% 

0.3% 

0.3% 

One-part  model  using  log  transformation 

Groups  of  1500 

50k 

2.2% 

2.5% 

1.8% 

2.1% 

1.0% 

1.0% 

1.1% 

25k 

1.5% 

1.8% 

1.1% 

1.3% 

1.1% 

1.1% 

1.1% 

10k 

0.9% 

1.1% 

0.5% 

0.7% 

1.3% 

1.3% 

1.4% 

5k 

0.7% 

0.9% 

0.4% 

0.5% 

1.2% 

1.3% 

1.4% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

1.7% 

1.9% 

3.4% 

3.5% 

1.1% 

1.0% 

1.0% 

25k 

1.1% 

1.3% 

2.5% 

2.5% 

0.8% 

1.2% 

1.1% 

10k 

0.6% 

0.7% 

1.6% 

1.6% 

0.5% 

1.5% 

1.4% 

5k 

0.4% 

0.6% 

1.2% 

1.2% 

0.4% 

1.5% 

1.3% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

1.0% 

1.0% 

0.9% 

0.9% 

0.9% 

0.7% 

0.7% 

25k 

0.6% 

0.6% 

0.6% 

0.6% 

0.7% 

0.5% 

0.5% 

10k 

0.3% 

0.4% 

0.3% 

0.4% 

0.4% 

0.4% 

0.4% 

5k 

0.3% 

0.3% 

0.3% 

0.3% 

0.4% 

0.3% 

0.3% 

One-part  model  using  actual  dollars    Groups  of  500 


50k 

2.6% 

2.7% 

2.4%

2.5% 

2.2% 

2.0% 

2.0% 

25k 

1.7% 

1.8% 

1.6% 

1.6% 

1.4% 

1.3% 

1.3% 

10k 

1.1% 

1.1% 

1.0% 

1.0% 

0.8% 

0.8% 

0.8% 

5k 

0.8% 

0.8% 

0.7% 

0.7% 

0.6% 

0.6% 

0.6% 

One-part  model  using  log  transformation 

Groups  of  500 

50k 

3.4% 

3.6% 

3.0% 

3.2% 

2.5% 

2.5% 

2.6% 

25k 

2.3% 

2.5% 

1.9% 

2.1% 

2.0% 

2.0% 

2.1% 

10k 

1.4% 

1.6% 

1.1% 

1.2% 

1.8% 

1.8% 

2.0% 

5k 

1.1% 

1.2% 

0.8% 

0.8% 

1.8% 

1.9% 

2.0% 

Two-part model using log transformation

Groups  of  500 

50k 

2.9% 

3.1% 

4.5% 

4.5% 

2.5% 

2.4% 

2.5% 

25k 

1.9% 

2.1% 

3.2% 

3.2% 

1.6% 

2.0% 

2.0% 

10k 

1.2% 

1.3% 

2.1% 

2.1% 

1.0% 

2.0% 

1.9% 

5k 

0.9% 

0.9% 

1.5% 

1.5% 

0.8% 

2.0% 

1.9% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

2.4% 

2.4% 

2.3% 

2.3% 

2.2% 

2.0% 

2.0% 

25k 

1.6% 

1.6% 

1.5% 

1.5% 

1.5% 

1.3% 

1.3% 

10k 

1.0% 

1.0% 

0.9% 

0.9% 

0.9% 

0.8% 

0.8% 

5k 

0.8% 

0.7% 

0.7% 

0.7% 

0.7% 

0.6% 

0.6%

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars  Groups  of  5000 


50k 

0.5% 

0.4% 

0.4% 

0.3% 

0.1% 

0.5% 

0.5% 

25k 

0.7% 

0.7% 

0.6% 

0.5% 

0.2% 

0.7% 

0.7% 

10k 

0.7% 

0.6% 

0.6% 

0.5% 

0.3% 

0.6% 

0.6% 

5k 

0.5% 

0.5% 

0.4% 

0.4% 

0.2% 

0.4% 

0.4% 

One-part  model  using  log  transformation 

Groups  of  5000 

50k 

0.3% 

0.3% 

0.1% 

0.1% 

2.1% 

4.9% 

5.2% 

25k 

0.5% 

0.5% 

0.1% 

0.1% 

1.8% 

4.5% 

4.8% 

10k 

0.4% 

0.5% 

0.1% 

0.1% 

2.3% 

5.3% 

5.6% 

5k 

0.3% 

0.3% 

0.2% 

0.2% 

3.0% 

6.3% 

6.6% 

Two-part  model  using  log  transformation 

Groups  of 5000 

50k 

0.4% 

0.3% 

2.1% 

2.1% 

0.1% 

0.8% 

0.7% 

25k 

0.6% 

0.6% 

2.2% 

2.2% 

0.1% 

0.5% 

0.5% 

10k 

0.5% 

0.5% 

1.7% 

1.7% 

0.1% 

0.6% 

0.6% 

5k 

0.3% 

0.3% 

1.0% 

1.0% 

0.1% 

0.8% 

0.7% 

Four-part  model  using  log  transformation    Groups  of  5000 


50k 

0.1% 

0.1% 

0.1% 

0.1% 

0.3% 

0.1% 

0.1% 

25k 

0.2% 

0.2% 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

10k 

0.2% 

0.2% 

0.2% 

0.2% 

0.1% 

0.0% 

0.0% 

5k 

0.2% 

0.2% 

0.1% 

0.2% 

0.1% 

0.0% 

0.0% 

One-part  model  using  actual  dollars   Groups  of  3000 


50k 

0.6% 

0.6% 

0.5% 

0.5% 

0.3% 

0.7% 

0.7% 

25k 

0.9% 

0.8% 

0.8% 

0.7% 

0.4% 

0.9% 

0.8% 

10k 

0.8% 

0.8% 

0.7% 

0.7% 

0.4% 

0.7% 

0.7% 

5k 

0.6% 

0.6% 

0.5% 

0.5% 

0.3% 

0.5% 

0.5% 

One-part  model  using  log  transformation 

Groups  of 3000 

50k 

0.4% 

0.5% 

0.2% 

0.2% 

2.2% 

5.1% 

5.4% 

25k 

0.6% 

0.7% 

0.2% 

0.2% 

1.8% 

4.7% 

4.9% 

10k 

0.6% 

0.7% 

0.2% 

0.2% 

2.2% 

5.3% 

5.6% 

5k 

0.4% 

0.5% 

0.2% 

0.2% 

2.9% 

6.3% 

6.5% 

Two-part model using log transformation

Groups  of  3000 

50k 

0.5% 

0.5% 

2.3% 

2.3% 

0.3% 

1.0% 

0.9% 

25k 

0.7% 

0.7% 

2.4% 

2.4% 

0.2% 

0.7% 

0.6% 

10k 

0.6% 

0.7% 

1.9% 

1.8% 

0.1% 

0.7% 

0.6% 

5k 

0.4% 

0.5% 

1.2% 

1.1% 

0.1% 

0.8% 

0.7% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

0.3% 

0.3% 

0.3% 

0.3% 

0.4% 

0.3% 

0.3% 

25k 

0.4% 

0.4% 

0.4% 

0.4% 

0.2% 

0.2% 

0.2% 

10k 

0.4% 

0.3% 

0.3% 

0.3% 

0.1% 

0.1% 

0.1% 

5k 

0.3% 

0.3% 

0.2% 

0.2% 

0.1% 

0.1% 

0.1%

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)

One-part  model  using  actual  dollars 

Groups  of  1500 

50k 

1.0% 

0.9% 

0.9% 

0.8% 

0.7% 

1.2% 

1.1% 

25k 

1.1% 

1.0% 

1.0% 

0.9% 

0.7% 

1.2% 

1.2% 

10k 

0.9% 

0.9% 

0.9% 

0.8% 

0.6% 

0.9% 

0.9% 

5k 

0.7% 

0.6% 

0.6% 

0.5% 

0.4% 

0.6% 

0.6% 

One-part  model  using  log  transformation 

Groups  of  1500 

50k 

0.9% 

0.9% 

0.7% 

0.7% 

2.6% 

4.6% 

4.9% 

25k 

1.0% 

1.0% 

0.6% 

0.6% 

2.2% 

4.2% 

4.5% 

10k 

0.8% 

0.9% 

0.4% 

0.4% 

2.4% 

4.7% 

5.0% 

5k 

0.6% 

0.6% 

0.5% 

0.5% 

3.2% 

5.9% 

6.1% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

0.9% 

0.9% 

2.6% 

2.6% 

0.8% 

1.2% 

1.2% 

25k 

1.0% 

1.0% 

2.6% 

2.6% 

0.6% 

0.9% 

0.9% 

10k 

0.8% 

0.8% 

2.0% 

2.0% 

0.4% 

0.8% 

0.7% 

5k 

0.6% 

0.6% 

1.2% 

1.2% 

0.4% 

0.9% 

0.9% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

0.7% 

0.7% 

0.6% 

0.6% 

0.8% 

0.6% 

0.7% 

25k 

0.7% 

0.6% 

0.6% 

0.6% 

0.6% 

0.5% 

0.5% 

10k 

0.5% 

0.5% 

0.5% 

0.5% 

0.4% 

0.3% 

0.3% 

5k 

0.4% 

0.4% 

0.3% 

0.3% 

0.3% 

0.2% 

0.2% 

One-part  model  using  actual  dollars   Groups  of  500 


50k 

1.7% 

1.7%

1.8% 

1.8% 

2.0% 

2.5% 

2.5% 

25k 

1.8% 

1.8% 

1.6% 

1.6% 

1.6% 

2.1% 

2.1% 

10k 

1.4% 

1.4% 

1.2% 

1.2% 

1.1% 

1.4% 

1.4% 

5k 

1.1% 

1.1% 

0.9% 

0.9% 

0.9% 

1.0% 

1.0% 

One-part model using log transformation    Groups of 500


50k 

1.9% 

1.7% 

2.2% 

2.3% 

5.4% 

9.3% 

10.0% 

25k 

1.8% 

1.6% 

1.7% 

1.8% 

4.3% 

7.9% 

8.6% 

10k 

1.5% 

1.3% 

1.4% 

1.5% 

4.1% 

7.6% 

8.3% 

5k 

1.2% 

1.0% 

1.5% 

1.6% 

4.6% 

8.1% 

8.7% 

Two-part model using log transformation

Groups  of  500 

50k 

1.7% 

1.6% 

3.0% 

3.0% 

2.5% 

3.3% 

3.3% 

25k 

1.6% 

1.6% 

2.9% 

2.8% 

1.9% 

2.4% 

2.4% 

10k 

1.3% 

1.2% 

2.0% 

2.0% 

1.3% 

1.9% 

1.9% 

5k 

1.0% 

1.0% 

1.3% 

1.3% 

1.2% 

1.7% 

1.7% 

Four-part  model  using  log  transformation  Groups  of  500 


50k 

1.5% 

1.5% 

1.5% 

1.5% 

2.3% 

2.1% 

2.1% 

25k 

1.4% 

1.3% 

1.3% 

1.3% 

1.7% 

1.5% 

1.6% 

10k 

1.1% 

1.0% 

0.9% 

0.9% 

1.1% 

0.9% 

0.9% 

5k 

0.9% 

0.8% 

0.7% 

0.7% 

0.9% 

0.7% 

0.7% 

HMO-A Percent of Groups within 5% of Actual

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)

One-part  model  using  actual  dollars   Groups  of  5000 


50k 

21.7% 

18.3% 

40.0% 

45.0% 

55.0% 

60.0% 

56.7% 

25k 

48.3% 

45.0% 

60.0% 

58.3% 

75.0% 

76.7% 

78.3% 

10k 

61.7% 

65.0% 

81.7% 

83.3% 

90.0% 

93.3% 

91.7% 

5k 

63.3% 

65.0% 

76.7% 

78.3% 

86.7% 

90.0% 

90.0% 

One-part  model  using  log  transformation    Groups  of  5000 


50k 

3.3% 

3.3% 

6.7% 

3.3% 

68.3% 

71.7% 

71.7% 

25k 

8.3% 

5.0% 

16.7% 

13.3% 

53.3% 

63.3% 

63.3% 

10k 

18.3% 

11.7% 

55.0% 

50.0% 

25.0% 

33.3% 

31.7% 

5k 

31.7% 

13.3% 

73.3% 

66.7% 

18.3% 

20.0% 

23.3% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k 

6.7% 

6.7% 

0.0% 

0.0% 

40.0% 

66.7% 

66.7% 

25k 

18.3% 

15.0% 

0.0% 

0.0% 

66.7% 

43.3% 

48.3% 

10k 

55.0% 

41.7% 

0.0% 

0.0% 

86.7% 

13.3% 

18.3% 

5k 

63.3% 

55.0% 

0.0% 

0.0% 

91.7% 

6.7% 

15.0% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

41.7% 

41.7% 

43.3% 

46.7% 

58.3% 

70.0% 

66.7% 

25k 

68.3% 

66.7% 

68.3% 

66.7% 

75.0% 

85.0% 

85.0% 

10k 

88.3% 

83.3% 

91.7% 

86.7% 

90.0% 

88.3% 

91.7% 

5k 

83.3% 

80.0% 

88.3% 

86.7% 

96.7% 

96.7% 

96.7% 

One-part model using actual dollars    Groups of 3000


50k 

23.3% 

23.3% 

31.7% 

28.3% 

35.0% 

43.3% 

41.7% 

25k 

41.7% 

38.3% 

55.0% 

56.7% 

61.7% 

65.0% 

63.3% 

10k 

53.3% 

56.7% 

66.7% 

65.0% 

78.3% 

80.0% 

78.3% 

5k 

53.3% 

51.7% 

70.0% 

70.0% 

81.7% 

85.0% 

81.7% 

One-part  model  using  log  transformation 

Groups  of  3000 

50k 

3.3% 

1.7% 

6.7% 

3.3% 

71.7% 

65.0% 

68.3% 

25k 

8.3% 

5.0% 

15.0% 

10.0% 

66.7% 

61.7% 

63.3% 

10k 

25.0% 

15.0% 

45.0% 

36.7% 

38.3% 

48.3% 

51.7% 

5k 

28.3% 

20.0% 

65.0% 

53.3% 

30.0% 

30.0% 

31.7% 

Two-part model using log transformation

Groups  of  3000 

50k 

10.0% 

5.0% 

0.0% 

0.0% 

23.3% 

65.0% 

63.3% 

25k 

23.3% 

16.7% 

0.0% 

0.0% 

45.0% 

55.0% 

65.0% 

10k 

41.7% 

33.3% 

3.3% 

3.3% 

71.7% 

16.7% 

28.3% 

5k 

61.7% 

43.3% 

6.7% 

3.3% 

81.7% 

15.0% 

16.7% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

35.0% 

35.0% 

36.7% 

36.7% 

35.0% 

50.0% 

48.3% 

25k 

55.0% 

51.7% 

56.7% 

56.7% 

65.0% 

80.0% 

76.7% 

10k 

78.3% 

68.3% 

83.3% 

76.7% 

78.3% 

78.3% 

83.3% 

5k 

78.3% 

73.3% 

81.7% 

78.3% 

81.7% 

83.3% 

85.0%

HMO-A Percent of Groups within 5% of Actual

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars   Groups  of  1500 


50k 

30.0% 

30.0% 

35.0% 

36.7% 

35.0% 

41.7% 

45.0% 

25k 

36.7% 

35.0% 

41.7% 

50.0% 

50.0% 

50.0% 

50.0% 

10k 

50.0% 

48.3% 

65.0% 

68.3% 

68.3% 

71.7% 

71.7% 

5k 

46.7% 

48.3% 

55.0% 

63.3% 

66.7% 

70.0% 

70.0% 

One-part  model  using  log  transformation  Groups  of  1500 


50k 

15.0% 

13.3% 

18.3% 

15.0% 

45.0% 

43.3% 

41.7% 

25k 

16.7% 

15.0% 

25.0% 

21.7% 

43.3% 

48.3% 

46.7% 

10k 

28.3% 

21.7% 

45.0% 

41.7% 

33.3% 

43.3% 

41.7% 

5k 

35.0% 

21.7% 

55.0% 

46.7% 

33.3% 

36.7% 

38.3% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

21.7% 

18.3% 

1.7% 

1.7% 

33.3% 

46.7% 

43.3% 

25k 

30.0% 

26.7% 

3.3% 

3.3% 

46.7% 

46.7% 

48.3% 

10k 

41.7% 

38.3% 

5.0% 

5.0% 

60.0% 

21.7% 

28.3% 

5k 

43.3% 

40.0% 

8.3% 

6.7% 

66.7% 

21.7% 

26.7% 

Four-part model using log transformation

Groups  of  1500 

50k 

30.0% 

31.7% 

36.7% 

38.3% 

35.0% 

46.7% 

41.7% 

25k 

46.7% 

48.3% 

48.3% 

45.0% 

51.7% 

58.3% 

56.7% 

10k 

61.7% 

60.0% 

73.3% 

71.7% 

66.7% 

70.0% 

66.7% 

5k 

60.0% 

60.0% 

71.7% 

65.0% 

68.3% 

75.0% 

75.0% 

One-part model using actual dollars    Groups of 500


50k 

11.7% 

13.3% 

13.3% 

16.7% 

28.3% 

23.3% 

23.3% 

25k 

18.3% 

20.0% 

18.3% 

23.3% 

35.0% 

40.0% 

41.7% 

10k 

28.3% 

38.3% 

38.3% 

36.7% 

50.0% 

48.3% 

45.0% 

5k 

33.3% 

35.0% 

35.0% 

43.3% 

45.0% 

51.7% 

53.3% 

One-part  model  using  log  transformation 

Groups  of  500 

50k 

11.7% 

13.3% 

6.7% 

10.0% 

25.0% 

25.0% 

21.7% 

25k 

11.7% 

11.7% 

10.0% 

13.3% 

30.0% 

35.0% 

26.7% 

10k 

23.3% 

21.7% 

23.3% 

25.0% 

35.0% 

45.0% 

41.7% 

5k 

26.7% 

21.7% 

28.3% 

25.0% 

36.7% 

40.0% 

38.3% 

Two-part model using log transformation

Groups  of  500 

50k 

10.0% 

10.0% 

13.3% 

13.3% 

18.3% 

26.7% 

26.7% 

25k 

10.0% 

15.0% 

6.7% 

10.0% 

30.0% 

41.7% 

33.3% 

10k 

30.0% 

26.7% 

8.3% 

10.0% 

36.7% 

38.3% 

36.7% 

5k 

35.0% 

30.0% 

16.7% 

16.7% 

38.3% 

40.0% 

41.7% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

15.0% 

18.3% 

18.3% 

20.0% 

26.7% 

21.7% 

25.0% 

25k 

18.3% 

21.7% 

20.0% 

28.3% 

38.3% 

38.3% 

33.3% 

10k 

38.3% 

41.7% 

41.7% 

40.0% 

45.0% 

51.7% 

53.3% 

5k 

41.7% 

40.0% 

45.0% 

38.3% 

43.3% 

55.0% 

50.0%

HMO-B Percent of Groups within 5% of Actual

Stoploss level    A_G(1)    A_G(2)    CHR(1)    CHR(2)    ACG    ADG(1)    ADG(2)


One-part  model  using  actual  dollars    Groups  of  5000 


50k 

31.7% 

36.7% 

38.3% 

46.7% 

93.3% 

28.3% 

31.7% 

25k 

13.3% 

16.7% 

16.7% 

18.3% 

63.3% 

11.7% 

13.3% 

10k 

15.0% 

18.3% 

15.0% 

20.0% 

58.3% 

11.7% 

15.0% 

5k 

23.3% 

26.7% 

26.7% 

30.0%

66.7%

23.3% 

28.3% 

One-part  model  using  log  transformation 

Groups  of  5000 

50k    56.7%    51.7%    93.3%    93.3%    0.0%    0.0%    0.0%
25k    33.3%    26.7%    91.7%    91.7%    0.0%    0.0%    0.0%
10k    33.3%    25.0%    86.7%    88.3%    0.0%    0.0%    0.0%
5k     53.3%    41.7%    75.0%    71.7%    0.0%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  5000 

50k

43.3% 

46.7% 

0.0% 

0.0% 

81.7% 

13.3% 

16.7% 

25k 

23.3% 

25.0% 

0.0% 

0.0% 

93.3% 

31.7% 

35.0% 

10k 

26.7% 

21.7% 

0.0% 

0.0% 

91.7% 

10.0% 

18.3% 

5k 

45.0% 

38.3% 

5.0% 

5.0% 

86.7% 

3.3% 

8.3% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

80.0% 

88.3% 

80.0% 

83.3% 

66.7% 

88.3% 

88.3% 

25k 

65.0% 

68.3% 

65.0% 

70.0% 

83.3% 

96.7% 

95.0% 

10k 

66.7% 

70.0% 

73.3% 

60.0% 

95.0% 

98.3% 

98.3% 

5k 

75.0% 

76.7% 

80.0% 

80.0% 

93.3% 

98.3% 

98.3% 

One-part  model  using  actual  dollars  Groups  of 3000 


50k 

30.0% 

33.3% 

31.7% 

38.3% 

58.3% 

26.7% 

31.7% 

25k 

21.7% 

20.0% 

21.7% 

23.3% 

45.0% 

13.3% 

18.3% 

10k 

16.7% 

20.0% 

21.7% 

25.0% 

41.7% 

16.7% 

15.0% 

5k 

26.7% 

30.0% 

28.3% 

31.7% 

58.3% 

30.0% 

30.0% 

One-part  model  using  log  transformation   Groups  of  3000 


50k 

38.3% 

36.7% 

76.7% 

73.3% 

3.3% 

0.0% 

0.0% 

25k 

31.7% 

26.7% 

73.3% 

75.0% 

6.7% 

0.0% 

0.0% 

10k 

33.3% 

23.3% 

83.3% 

81.7% 

0.0% 

0.0% 

0.0% 

5k 

45.0% 

35.0% 

68.3% 

66.7% 

0.0% 

0.0% 

0.0% 

Two-part  model  using  log  transformation 


Groups  of 3000 


50k 

35.0% 

31.7% 

3.3% 

3.3% 

75.0% 

33.3% 

36.7% 

25k 

26.7% 

26.7% 

3.3% 

3.3% 

73.3% 

43.3% 

48.3% 

10k 

28.3% 

23.3% 

3.3% 

3.3% 

83.3% 

33.3% 

33.3% 

5k 

40.0% 

38.3% 

6.7% 

6.7% 

86.7% 

23.3% 

26.7% 

Four-part  model  using  log  transformation  Groups  of  3000 


50k 

65.0% 

63.3% 

61.7% 

65.0% 

56.7% 

68.3% 

68.3% 

25k 

43.3% 

48.3% 

45.0% 

46.7% 

78.3% 

73.3% 

71.7% 

10k 

51.7% 

50.0% 

53.3% 

51.7% 

81.7% 

80.0% 

80.0% 

5k 

60.0% 

60.0% 

71.7% 

58.3% 

93.3% 

86.7% 

83.3% 
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HMO-B  Percent  of  Groups  within  5%  of  Actual 


level  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

One-part  model  using  actual  dollars 

Groups  of  1500 

50k

38.3%

43.3% 

35.0% 

36.7% 

43.3% 

40.0% 

38.3% 

25k 

28.3% 

31.7% 

30.0% 

33.3% 

41.7% 

30.0% 

31.7% 

10k

25.0% 

30.0% 

30.0% 

30.0% 

51.7% 

28.3% 

28.3% 

5k

33.3% 

38.3% 

45.0% 

46.7% 

56.7% 

33.3% 

35.0% 

One-part  model  using  log  transformation

Groups  of  1500 

50k

33.3%

38.3% 

45.0% 

43.3% 

20.0% 

6.7% 

6.7% 

25k 

30.0% 

30.0% 

46.7% 

45.0% 

20.0% 

6.7% 

5.0% 

10k

35.0% 

33.3% 

55.0% 

51.7% 

13.3% 

3.3% 

3.3% 

5k

46.7% 

36.7% 

53.3% 

50.0% 

5.0% 

0.0% 

0.0% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k

35.0%

38.3% 

11.7% 

11.7% 

41.7% 

35.0% 

36.7% 

25k 

28.3% 

26.7% 

8.3% 

10.0% 

41.7% 

38.3% 

40.0% 

10k 

31.7% 

35.0% 

6.7% 

8.3% 

56.7% 

38.3% 

40.0% 

5k 

48.3% 

46.7% 

20.0% 

20.0% 

53.3% 

28.3% 

33.3% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

45.0% 

43.3% 

45.0% 

45.0% 

35.0% 

40.0% 

40.0% 

25k 

41.7% 

45.0% 

50.0% 

53.3% 

41.7% 

45.0% 

45.0% 

10k 

46.7% 

50.0% 

48.3% 

51.7% 

48.3% 

55.0% 

53.3% 

5k 

53.3% 

58.3% 

58.3% 

60.0% 

65.0% 

68.3% 

66.7% 

One-part  model  using  actual  dollars 


Groups  of  500 


50k 

35.0% 

31.7% 

30.0% 

31.7% 

25.0% 

30.0% 

25.0% 

25k 

36.7% 

36.7% 

33.3% 

36.7% 

28.3% 

30.0% 

31.7% 

10k 

38.3% 

40.0% 

38.3% 

43.3% 

28.3% 

35.0% 

33.3% 

5k 

33.3% 

36.7% 

38.3% 

41.7% 

35.0% 

36.7% 

38.3% 

One-part  model  using  log  transformation 

Groups  of  500 

50k

26.7% 

33.3% 

21.7% 

23.3% 

15.0% 

16.7% 

15.0% 

25k 

25.0% 

30.0% 

31.7% 

28.3% 

18.3% 

15.0% 

13.3% 

10k 

33.3% 

35.0% 

26.7% 

35.0% 

20.0% 

21.7% 

18.3% 

5k 

36.7% 

40.0% 

26.7% 

30.0% 

18.3% 

5.0% 

5.0% 

Two-part  model  using  log  transformation  Groups  of  500 


50k 

28.3% 

25.0% 

13.3% 

13.3% 

23.3% 

21.7% 

18.3% 

25k 

30.0% 

30.0% 

13.3% 

15.0% 

28.3% 

23.3% 

23.3% 

10k 

33.3% 

33.3% 

23.3% 

23.3% 

36.7% 

26.7% 

23.3% 

5k 

35.0% 

40.0% 

30.0% 

31.7% 

30.0% 

31.7% 

33.3% 

Four-part  model  using  log  transformation   Groups  of  500 


50k 

26.7% 

30.0% 

28.3% 

25.0% 

23.3% 

25.0% 

30.0% 

25k 

33.3% 

33.3% 

36.7% 

33.3% 

28.3% 

33.3% 

35.0% 

10k 

38.3% 

40.0% 

36.7% 

41.7% 

36.7% 

41.7% 

38.3% 

5k 

38.3% 

45.0% 

45.0% 

46.7% 

38.3% 

43.3% 

40.0% 
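
The group-level measure tabulated above can be illustrated with a short sketch. It assumes that a group counts as "within 5% of actual" when its mean expected value lies within plus or minus 5% of its mean actual value; that reading, the function name `percent_within`, and the sample figures are assumptions made for illustration and are not taken from the study.

```python
def percent_within(expected, actual, tolerance=0.05):
    """Percent of groups whose expected mean is within `tolerance` of the actual mean."""
    if len(expected) != len(actual) or not expected:
        raise ValueError("expected and actual must be non-empty and equal in length")
    hits = sum(
        1 for e, a in zip(expected, actual)
        if abs(e - a) <= tolerance * abs(a)
    )
    return 100.0 * hits / len(expected)


# Hypothetical group means in dollars per-member-per-month (not study data).
expected_pmpm = [75.0, 75.0, 75.0, 75.0]
actual_pmpm = [73.0, 80.0, 76.5, 60.0]
share = percent_within(expected_pmpm, actual_pmpm)
```

With these made-up values, two of the four groups fall inside the 5% band.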

Statistical  Tests  of  Significance 
Paired  t-test  of  differences  in  means  on  Mean  Forecasting  Bias 
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One-part  model  using  actual  dollars 
HMO-A,  Groups  of  3,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
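
As an illustration of the paired t-test named in this table's title, the following minimal sketch computes the t statistic and degrees of freedom for two risk adjustment methods evaluated on the same groups. The function name `paired_t` and the input values are hypothetical; the study's actual computation is not shown in the source.

```python
import math
import statistics


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    diffs = [a - b for a, b in zip(x, y)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical forecasting bias for the same three groups under two methods.
t_stat, dof = paired_t([3.0, 5.0, 7.0], [2.0, 3.0, 4.0])
```

The statistic would then be compared with the critical value of Student's t distribution at the chosen significance level.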


Statistical  Tests  of  Significance 

McNemar  test  of  paired  proportions  on  Percent  of  Groups  within  5%  of  Actual 


One-part  model  using  actual  dollars 
HMO-A,  Groups  of  3,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical) 
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant 
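
The McNemar test named in this table's title compares two methods on the same groups using only the discordant pairs. The sketch below is an assumption-laden illustration: it uses the continuity-corrected chi-square form with one degree of freedom, and the function name `mcnemar` and the counts are hypothetical. The source does not say which variant of the test was applied.

```python
import math


def mcnemar(b, c):
    """Continuity-corrected McNemar test.

    b: groups within 5% under method X but not under method Y
    c: groups within 5% under method Y but not under method X
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if b + c == 0:
        return 0.0, 1.0
    stat = max(0, abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(Z^2 > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical discordant counts (not study data).
stat, p_value = mcnemar(10, 2)
```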
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Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $50,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
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Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $25,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
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Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $10,000


[Pairwise  significance  matrices  for  Groups  of  5,000,  Groups  of  3,000,  Groups  of  1,500,  and  Groups  of  500  (rows  and  columns:  A1,  A2,  CH1,  CH2,  ACG,  AD1,  AD2);  cell  entries  are  not  recoverable  from  the  scanned  page.]

Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical) 
*  =  significant  at  .01  level;         **  =  significant  at  .0       ns  =  not  significant 
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Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $5,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CH 1  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant 
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Statistical  Tests  of  Significance 

McNemar  test  of  paired  proportions  on  Percent  of  Groups  within  5%  of  Actual 


Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $50,000
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
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Statistical  Tests  of  Significance 

McNemar  test  of  paired  proportions  on  Percent  of  Groups  within  5%  of  Actual 


Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $25,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;         **  =  significant  at  .0       ns  =  not  significant 
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Statistical  Tests  of  Significance 

McNemar  test  of  paired  proportions  on  Percent  of  Groups  within  5%  of  Actual 


Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $10,000
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
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Statistical  Tests  of  Significance 

McNemar  test  of  paired  proportions  on  Percent  of  Groups  within  5%  of  Actual 


Four-part  model  using  the  log  dollars 
HMO-A,  Stoploss  at  $5,000 
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Legend:  Al  (age/gender  continuous);  A2  (age/gender  categorical);  CHI  (chronic  flag  continuous);  CH2  (chronic  flag  categorical); 

ACG  (Amb.  Care  Groups);  AD1  (Amb.  Diag.  Groups  continuous);  AD2  (Amb.  Diag.  Groups  categorical)
*  =  significant  at  .01  level;  **  =  significant  at  .0       ns  =  not  significant
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APPENDIX  C 
Summaries  for  Supplemental  Analysis  (No  Copayment) 


1  Preparation  Summaries  (By  Stoploss  Level) 

Mean  Expected  Values 

HMO-A    200

HMO-B    201 

Mean  Expected  Values  Before  Smearing 

HMO-A    202 

HMO-B    204

2  Individual-level  Analysis  Summaries  (By  Stoploss  Level) 
Adjusted  R2  Values 

HMO-A    206 

HMO-B    207 

Mean  Absolute  Error,  validation  sample  (HMO-A  &  B)    208 

Standard  Deviation  of  Absolute  Error,  validation  sample  (HMO-A  &  B)    209 

Percent  of  Absolute  Error  within  $25,  validation  sample  (HMO-A  &  B)    210 

Percent  of  Absolute  Error  within  $50,  validation  sample  (HMO-A  &  B)    211 

Percent  of  Absolute  Error  more  than  $400,  validation  sample  (HMO-A  &  B)    212 

3  Group-level  Analysis  Summaries  (By  Stoploss  Level) 

Mean  Forecasting  Bias 

HMO-A    213 

HMO-B    215 

Mean  Squared  Forecasting  Error 

HMO-A    217

HMO-B    219 

Percent  of  Groups  within  5%  of  Target  (Actual)

HMO-A    221 

HMO-B    223 


NOTE:  Figures  reflecting  actual  values,  expected  values,  and  absolute  error 
are  reported  in  dollars  ($)  per-member-per-month 
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level 


HMO-A  Mean  Expected  Values  (no  copay)

A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

One-part  model  using  raw  dollars


Estimation  Half 


50k 

75.01 

75.01 

75.01 

75.01 

75.01 

75.01 

75.01 

25k 

72.15 

72.15 

72.15 

72.15 

72.15 

72.15 

72.15 

10k 

65.07 

65.07 

65.07 

65.07 

65.07 

65.07 

65.07 

5k 

55.57 

55.57 

55.57 

55.57 

55.57 

55.57 

55.57 

One-part  model  using  log  transformation 

Estimation  Half 

50k 

70.18 

69.23 

70.27 

69.47 

79.59 

77.58 

77.32 

25k 

67.90 

66.91 

68.36 

67.58 

77.34 

75.27 

75.00 

10k 

62.59 

61.51 

63.46 

62.67 

71.25 

69.54 

69.24 

5k 

54.59 

53.65 

55.64 

55.00 

62.07 

60.79 

60.52 

Two-part  model  using  log  transformation 

Estimation  Half 

50k 

72.22 

71.44 

71.12 

70.41 

73.70 

73.88 

73.63 

25k 

69.83 

69.00 

69.05 

68.33 

71.32 

71.63 

71.37 

10k 

64.29 

63.35 

63.91 

63.17 

65.56 

66.17 

65.88 

5k 

56.03 

55.22 

55.94 

55.34 

57.09 

57.86 

57.61 

Four-part  model  using  log  transformation

Estimation  Half 

50k 

76.34 

76.31 

75.34 

75.36 

75.16 

76.14 

75.71 

25k 

73.53 

73.37 

72.64 

72.54 

72.59 

73.32 

72.94 

10k 

66.42 

66.13 

65.73 

65.39 

65.77 

66.10 

65.88 

5k 

56.53 

56.24 

56.05 

55.66 

56.24 

56.38 

56.26 

One-part  model  using  raw  dollars  Validation  Half


50k 

73.65 

73.47 

74.93 

74.85 

76.12 

76.22 

76.06 

25k 

70.84 

70.72 

72.01 

71.98 

73.26 

73.23 

73.11 

10k 

63.88 

63.87 

64.86 

64.91 

65.98 

65.83 

65.80 

5k 

54.66 

54.70 

55.43 

55.50 

56.29 

56.24 

56.24 

One-part  model  using  log  transformation 

Validation  Half 

50k 

69.44

68.60 

70.79 

70.14 

80.73 

79.98 

80.04 

25k 

67.19 

66.31 

68.86 

68.23 

78.44 

77.60 

77.63 

10k 

61.94 

60.97 

63.92 

63.27 

72.22 

71.68 

71.63 

5k 

54.04 

53.21 

56.04 

55.52 

62.83 

62.64 

62.56 

Two-part  model  using  log  transformation 

Validation  Half

50k 

71.37 

70.68 

71.33 

70.77 

74.99 

75.63 

75.60 

25k 

69.01 

68.27 

69.25 

68.68 

72.56 

73.32 

73.27 

10k 

63.53 

62.69 

64.09 

63.48 

66.66 

67.71 

67.60 

5k 

55.39 

54.67 

56.10 

55.61 

57.98 

59.17 

59.07 

Four-part  model  using  log  transformation 

Validation  Half 

50k 

75.10 

75.02 

75.32 

75.39 

76.38 

77.67 

77.45 

25k 

72.35 

72.17 

72.61 

72.58 

73.77 

74.75 

74.55 

10k 

65.42 

65.14 

65.69 

65.39 

66.76 

67.25 

67.08 

5k 

55.80 

55.54 

56.07 

55.74 

57.01 

57.32 

57.22 
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llltll 


HMO-B  Mean  Expected  Values  (no  copay) 

A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


One-part  model  using  raw  dollars 


Estimation  Half 


50k 

93.19 

93.19 

93.19 

93.19 

93.19 

93.19 

93.19 

25k 

86.27 

86.27 

86.27 

86.27 

86.27 

86.27 

86.27 

10k 

74.86 

74.86 

74.86 

74.86 

74.86 

74.86 

74.86 

5k 

61.36 

61.36 

61.36 

61.36 

61.36 

61.36 

61.36 

One-part  model  using  log  transformation

Estimation  Half 

50k 

92.69 

90.94 

96.80 

95.64 

107.34 

118.71 

118.58 

25k 

87.03 

85.07 

92.11 

90.92 

101.92 

112.94 

112.80 

10k 

76.91 

74.93 

83.00 

81.87 

91.58 

101.18 

101.03 

5k 

64.22 

62.59 

69.97 

69.06 

77.34 

84.98 

84.83 

Two-part  model  using  log  transformation

Estimation  Half 

50k 

92.03 

91.14 

91.06 

90.06 

95.01 

96.19 

95.78 

25k 

86.46 

85.37 

86.20 

85.16 

89.82 

91.50 

91.09 

10k 

76.54 

75.39 

77.14 

76.18 

80.25 

82.09 

81.73 

5k 

63.94 

62.99 

64.80 

64.06 

67.31 

69.09 

68.81 

Four-part  model  using  log  transformation

Estimation  Half 

50k 

96.77 

96.86 

96.19 

96.22 

98.53 

99.19 

99.18 

25k 

90.29 

90.14 

90.01 

89.88 

91.92 

93.11 

93.07 

10k 

78.56 

78.13 

78.62 

78.17 

79.76 

81.05 

80.94 

5k 

64.00 

63.50 

64.18 

63.65 

64.86 

65.81 

65.64 

One-part  model  using  raw  dollars  Validation  Half 


50k 

81.73 

82.15 

82.50 

82.94 

85.67 

81.58 

81.78 

25k 

76.55 

76.87 

77.21 

77.54 

79.89 

76.86 

76.99 

10k 

67.57 

67.79 

68.07 

68.29 

69.85 

68.11 

68.19 

5k 

56.11 

56.28 

56.48 

56.64 

57.66 

56.60 

56.66 

One-part  model  using  log  transformation

Validation  Half 

50k 

83.25 

83.01 

87.54 

87.76 

99.79 

106.74 

107.35 

25k 

78.20 

77.70 

83.36 

83.48 

94.80 

101.75 

102.29 

10k 

69.23 

68.56 

75.24 

75.31 

85.24 

91.52 

91.96 

5k 

57.98 

57.44 

63.62 

63.71 

72.10 

77.15 

77.47 

Two-part  model  using  log  transformation 

Validation  Half

50k 

82.54 

82.69 

82.15 

82.03 

88.67 

86.28 

86.22 

25k 

77.60 

77.51 

77.83 

77.63 

83.86 

82.23 

82.14 

10k 

68.84 

68.59 

69.80 

69.58 

74.99 

74.08 

74.00 

5k 

57.70 

57.48 

58.84 

58.68 

63.00 

62.61 

62.55 

Four-part  model  using  log  transformation 

Validation  Half 

50k 

85.18 

85.56 

85.08 

85.35 

90.70 

88.16 

88.45 

25k 

79.92 

80.09 

80.05 

80.18 

84.89 

83.44 

83.61 

10k 

70.38 

70.32 

70.74 

70.44 

74.09 

73.69 

73.69 

5k 

58.05 

57.95 

58.46 

58.21 

60.68 

60.46 

60.38 
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(nocopay)     HMO- A         Mean  Expected  Values  (Before  Smearing) 

A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


($) 


One-part  model  using  raw  dollars 


Estimation  Half 


50k 

75.01 

75.01 

75.01 

75.01 

75.01 

75.01 

75.01 

25k 

72.15 

72.15 

72.15 

72.15 

72.15 

72.15 

72.15 

10k 

65.07 

65.07 

65.07 

65.07 

65.07 

65.07 

65.07 

5k 

55.57 

55.57 

55.57 

55.57 

55.57 

55.57 

55.57 

One-part  model  using  log  transformation  Estimation  Half 


50k 

19.98 

19.59 

21.14 

20.92 

23.94 

25.30 

25.08 

25k 

19.95 

19.57 

21.10 

20.88 

23.89 

25.24 

25.02 

10k 

19.75 

19.40 

20.87 

20.67 

23.58 

24.85 

24.65 

5k 

19.26 

18.97 

20.30 

20.14 

22.79 

23.90 

23.74 

Two-part  model  using  log  transformation    Estimation  Half 


50k 

23.93 

23.60 

24.62 

24.43 

26.31 

27.62 

27.44 

25k 

23.89 

23.56 

24.57 

24.39 

26.26 

27.55 

27.38 

10k 

23.64 

23.35 

24.31 

24.15 

25.93 

27.15 

27.00 

5k 

23.05 

22.81 

23.65 

23.53 

25.09 

26.17 

26.05 

Four-part  model  using  log  transformation 

Estimation  Half 

50k 

44.36 

44.13 

44.36 

44.21 

44.92 

46.08 

45.75 

25k 

43.87 

43.64 

43.87 

43.72 

44.41 

45.53 

45.21 

10k 

41.20 

41.03 

41.25 

41.06 

41.86 

42.57 

42.42 

5k 

36.05 

35.92 

36.17 

35.95 

36.88 

37.32 

37.25 

Four-part  model  using  log  (ambulatory) 


Estimation  Half 


50k 

17.45 

17.32 

17.79 

17.72 

18.56 

18.93 

18.88 

25k 

17.45 

17.32 

17.79 

17.72 

18.56 

18.93 

18.88 

10k 

17.44 

17.31 

17.78 

17.70 

18.54 

18.92 

18.86 

5k 

17.36 

17.24 

17.70 

17.62 

18.43 

18.80 

18.75 

Four-part  model  using  log   (inpatient)  Estimation  Half 


50k 

26.91 

26.82 

26.57 

26.49 

26.36 

27.15 

26.87 

25k 

26.42 

26.32 

26.08 

26.00 

25.86 

26.60 

26.33 

10k 

23.76 

23.72 

23.47 

23.37 

23.32 

23.66 

23.55 

5k 

18.69 

18.68 

18.48 

18.33 

18.45 

18.51 

18.50 

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(no  copay) 

HMO-A          Mean  Expected  Values           (Before  Smearing) 

level  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)  ($)

One-part  model  using  raw  dollars

Validation  Half 

50k

73.65

73.47

74.93

74.85

76.12

76.22

76.06

25k

70.84

70.72

72.01

71.98

73.26

73.23

73.11

10k

63.88

63.87

64.86

64.91

65.98

65.83

65.80

5k

54.66

54.70

55.43

55.50

56.29

56.24

56.24

One-part  model  using  log  transformation

Validation  Half 

50k 

19.77 

19.42 

26.09 

25.96 

24.28 

21.30 

21.12 

25k 

19.74 

19.39 

26.02 

25.90 

24.23 

21.25 

21.08 

10k 

19.54 

19.23 

25.61 

25.50 

23.90 

21.01 

20.87 

5k 

19.07 

18.81 

24.62 

24.53 

23.06 

20.44 

20.33 

Two-part  model  using  log  transformation

Validation  Half 

50k 

23.64 

23.35 

28.27 

28.18 

26.77 

24.69 

24.56 

25k 

23.61 

23.31 

28.20 

28.11 

26.72 

24.64 

24.52 

10k 

23.37 

23.11 

27.78 

27.70 

26.37 

24.37 

24.27 

5k 

22.79 

22.58 

26.76 

26.70 

25.47 

23.72 

23.64 

Four-part  model  using  log  transformation   Validation  Half 


50k 

43.58 

43.31 

47.02 

46.83 

45.70 

44.34 

44.20 

25k 

43.10 

42.84 

46.44 

46.24 

45.19 

43.84 

43.71 

10k 

40.49 

40.33 

43.32 

43.19 

42.55 

41.20 

41.01 

5k 

35.51 

35.39 

37.94 

37.88 

37.45 

36.16 

35.96 

Four-part  model  using  log   (ambulatory)  Validation  Half 


50k 

17.32 

17.21 

19.25 

19.22 

18.71 

17.84 

17.80 

25k 

17.32 

17.21 

19.25 

19.22 

18.71 

17.84 

17.80 

10k 

17.31 

17.20 

19.23 

19.21 

18.69 

17.83 

17.79 

5k 

17.24 

17.14 

19.11 

19.08 

18.57 

17.75 

17.71 

Four-part  model  using  log   (inpatient)  Validation  Half 


50k 

26.26 

26.10 

27.77 

27.60 

26.99 

26.50 

26.40 

25k 

25.77 

25.63 

27.19 

27.01 

26.48 

26.00 

25.90 

10k 

23.18 

23.12 

24.08 

23.98 

23.86 

23.37 

23.23 

5k 

18.28 

18.26 

18.83 

18.80 

18.88 

18.41 

18.26 
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(no  copay)     HMO-B  Mean  Expected  Values  (Before  Smearing) 

A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


($) 




One-part  model  using  raw  dollars 


Estimation  Half 


50k 

93.19 

93.19 

93.19 

93.19 

93.19 

93.19 

93.19 

25k 

86.27 

86.27 

86.27 

86.27 

86.27 

86.27 

86.27 

10k 

74.86 

74.86 

74.86 

74.86 

74.86 

74.86 

74.86 

5k 

61.36 

61.36 

61.36 

61.36 

61.36 

61.36 

61.36 

One-part  model  using  log  transformation


Estimation  Half 


50k

19.02

18.38

21.34

21.06

24.85

28.96

28.79

25k

18.95

18.32

21.24

20.97

24.69

28.70

28.54

10k

18.68

18.08

20.89

20.63

24.14

27.86

27.70

5k

18.05

17.52

20.09

19.86

22.99

26.22

26.08

Two-part  model  using  log  transformation  Estimation  Half 


50k 

25.76 

25.38 

26.62 

26.46 

28.70 

30.85 

30.79 

25k 

25.64 

25.27 

26.49 

26.33 

28.52 

30.60 

30.54 

10k 

25.26 

24.91 

26.06 

25.90 

27.92 

29.80 

29.74 

5k 

24.35 

24.05 

25.07 

24.93 

26.64 

28.23 

28.17 

Four-part  model  using  log  transformation 

Estimation  Half

50k 

53.69 

53.58 

54.04 

54.02 

56.15 

57.46 

57.42 

25k 

52.30 

52.16 

52.65 

52.60 

54.42 

55.76 

55.72 

10k 

48.03 

47.80 

48.37 

48.15 

49.72 

50.82 

50.76 

5k 

40.31 

40.03 

40.66 

40.36 

41.57 

42.39 

42.28 

Four-part  model  using  log  (ambulatory) 

Estimation  Half 

50k 

17.09 

16.85 

17.47 

17.33 

18.16 

18.90 

18.84 

25k 

17.09 

16.85 

17.47 

17.33 

18.16 

18.90 

18.84 

10k 

17.08 

16.84 

17.46 

17.32 

18.14 

18.88 

18.82 

5k 

16.98 

16.75 

17.36 

17.22 

18.03 

18.75 

18.70 

Four-part  model  using  log  (inpatient) 


Estimation  Half 


50k 

36.60 

36.73 

36.57 

36.69 

37.99 

38.56 

38.59 

25k 

35.22 

35.32 

35.18 

35.27 

36.27 

36.86 

36.89 

10k 

30.95 

30.97 

30.91 

30.83 

31.58 

31.94 

31.94 

5k 

23.32 

23.28 

23.30 

23.14 

23.54 

23.64 

23.58 
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/ha  rnnaul 
(no  copay)

HMO-B 

Mean  Expected  Values 

(Before  Smearing) 

A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

($) 

One-part  model  using  raw  dollars

Validation  Half 

50k 

81.73 

82.15 

82.50 

82.94 

85.67 

81.58 

81.78 

25k 

76.55 

76.87 

77.21 

77.54 

79.89 

76.86 

76.99 

10k 

67.57 

67.79 

68.07 

68.29 

69.85 

68.11 

68.19 

5k 

56.11 

56.28 

56.48 

56.64 

57.66 

56.60 

56.66 

One-part  model  using  log  transformation 

Validation  Half 

50k

17.09

16.78

26.04

26.07

23.10

19.30

19.33

25k

17.03

16.73

25.87

25.89

22.97

19.22

19.25

10k

16.82

16.56

25.24

25.25

22.48

18.95

18.98

5k

16.32

16.10

23.88

23.89

21.45

18.29

18.34

Two-part  model  using  log  transformation 

Validation  Half 

50k 

23.10 

23.03 

27.68 

27.72 

26.78 

24.02 

24.10 

25k 

23.02 

22.95 

27.52 

27.56 

26.63 

23.93 

24.01 

10k 

22.73 

22.67 

26.94 

26.97 

26.10 

23.60 

23.67 

5k 

22.02 

21.98 

25.66 

25.69 

24.95 

22.80 

22.87 

Four-part  model  using  log  transformation  Validation  Half 


50k 

46.94 

46.98 

50.85 

51.01

51.50 

47.48 

47.58 

25k 

45.94 

45.95 

49.76 

49.87 

50.05 

46.47 

46.53 

10k 

42.68 

42.64 

46.09 

46.10 

45.97 

43.18 

42.97 

5k 

36.32 

36.25 

38.93 

38.87 

38.73 

36.79 

36.61 

Four-part  model  using  log   (ambulatory)  Validation  Half 


50k 

15.79 

15.71 

17.37 

17.33 

17.20 

16.23 

16.21 

25k 

15.79 

15.71 

17.37 

17.33 

17.20 

16.23 

16.21 

10k 

15.78 

15.71 

17.35 

17.32 

17.19 

16.22 

16.21 

5k 

15.71 

15.64 

17.25 

17.22 

17.09 

16.14 

16.14 

Four-part  model  using  log   (inpatient)  Validation  Half 


50k 

31.15 

31.27 

33.49 

33.67 

34.30 

31.25 

31.37 

25k 

30.15 

30.24 

32.40 

32.53 

32.85 

30.24 

30.33 

10k 

26.90 

26.93 

28.73 

28.78 

28.79 

26.96 

26.76 

5k 

20.61 

20.62 

21.68 

21.66 

21.65 

20.65 

20.47 
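
For the log-transformed models, the "before smearing" values above are far lower than the corresponding Mean Expected Values, which is what one would expect when predictions made on the log scale are exponentiated without a retransformation correction. The sketch below illustrates one common correction, Duan's nonparametric smearing factor (the mean of the exponentiated log-scale residuals). The source does not state which smearing estimator was used, so this choice, the function names, and the numbers are assumptions for illustration only.

```python
import math


def smearing_factor(log_residuals):
    """Duan-style smearing factor: mean of exp(residual) on the log scale."""
    if not log_residuals:
        raise ValueError("at least one residual is required")
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def retransform(log_prediction, factor):
    """Convert a log-scale prediction to dollars and apply the smearing factor."""
    return math.exp(log_prediction) * factor


# Hypothetical residuals and prediction (not study data).
factor = smearing_factor([math.log(2.0), -math.log(2.0)])
dollars = retransform(math.log(20.0), factor)
```

In this made-up case the factor is 1.25, so a naive retransformed value of 20 becomes 25 after smearing.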
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HMO-A  Adj.  R-Square  (nocopay) 

level  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

One-part  model  using  raw  dollars 

Estimation  Half 

50k 

0.035 

0.027 

0.049 

0.044 

0.061 

0.084 

0.080 

25k 

0.042 

0.034 

0.059 

0.054 

0.081 

0.106 

0.102 

10k 

0.045 

0.041 

0.067 

0.065 

0.105 

0.130 

0.128 

5k 

0.046 

0.044 

0.073 

0.073 

0.122 

0.150 

0.148 

One-part  model  using  log  transformation 

Estimation  Half 

50k 

0.027 

0.021 

0.045 

0.039 

0.051 

0.012 

0.007 

25k 

0.034 

0.026 

0.056 

0.049 

0.066 

0.016 

0.008 

10k 

0.032 

0.032 

0.061 

0.060 

0.079 

-0.012 

-0.011 

5k

0.028 

0.035 

0.062 

0.066 

0.086 

-0.038 

-0.029 

Two-part  model  using  log  transformation

Estimation  Half 

50k 

0.029 

0.022 

0.044 

0.039 

0.056 

0.057 

0.053 

25k 

0.036 

0.028 

0.055 

0.049 

0.073 

0.074 

0.069 

10k

0.036 

0.035 

0.063 

0.061 

0.093 

0.079 

0.079 

5k 

0.035 

0.038 

0.067 

0.069 

0.106 

0.080 

0.084 

Four-part  model  using  log  transformation 

Estimation  Half 

50k 

0.033 

0.025 

0.050 

0.044 

0.060 

0.061 

0.069 

25k 

0.041 

0.032 

0.061 

0.055 

0.079 

0.084 

0.092 

10k

0.042 

0.039 

0.068 

0.066 

0.103 

0.117 

0.121 

5k 

0.042 

0.044 

0.072 

0.073 

0.120 

0.137 

0.142 

One-part  model  using  raw  dollars 

Validation  Half 

50k 

0.031 

0.025 

0.042 

0.038 

0.044 

0.067 

0.065 

25k 

0.040 

0.034 

0.055 

0.051 

0.066 

0.090 

0.087 

10k 

0.049 

0.044 

0.071 

0.068 

0.094 

0.120 

0.118 

5k 

0.053 

0.049 

0.079 

0.078 

0.110 

0.140 

0.138 

One-part  model  using  log  transformation

Validation  Half 

50k 

0.021 

0.016 

0.034 

0.031 

0.033 

0.037 

0.033 

25k 

0.026 

0.023 

0.046 

0.042 

0.047 

0.042 

0.036 

10k 

0.031 

0.031 

0.058 

0.057 

0.058 

0.033 

0.028 

5k 

0.031 

0.033 

0.062 

0.063 

0.061 

0.016 

0.014 

Two-part  model  using  log  transformation

Validation  Half 

50k 

0.023 

0.018 

0.034 

0.031 

0.037 

0.053 

0.050 

25k 

0.029 

0.025 

0.046 

0.043 

0.055 

0.071 

0.067 

10k 

0.036 

0.034 

0.060 

0.059 

0.075 

0.087 

0.084 

5k 

0.037 

0.037 

0.066 

0.066 

0.082 

0.090 

0.089 

Four-part  model  using  log  transformation 

Validation  Half 

50k 

0.030 

0.024 

0.043 

0.038 

0.044 

0.063 

0.063 

25k 

0.038 

0.033 

0.056 

0.052 

0.066 

0.082 

0.081 

10k 

0.047 

0.044 

0.072 

0.070 

0.094 

0.111 

0.111 

5k 

0.050 

0.050 

0.079 

0.078 

0.109 

0.129 

0.130 
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HMO-B  Adj.  R-Square  (nocopay) 


A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

One-part  model  using  raw  dollars 

Estimation  Half 

50k 

0.031 

0.024 

0.046 

0.042 

0.065 

0.099 

0.096 

25k 

0.037 

0.030 

0.056 

0.053 

0.084 

0.119 

0.117 

10k 

0.043 

0.038 

0.067 

0.065 

0.107 

0.140 

0.139 

5k

0.051 

0.045 

0.078 

0.077 

0.127 

0.160 

0.159 

One-part  model  using  log  transformation 

Estimation  Half 

50k 

0.002 

0.012 

0.027 

0.028 

0.047 

-0.100 

-0.099 

25k 

-0.002 

0.014 

0.029 

0.032 

0.052 

-0.185 

-0.181 

10k 

-0.010 

0.018 

0.023 

0.033 

0.046 

-0.362 

-0.349 

5k 

-0.015 

0.020 

0.016 

0.031 

0.041 

-0.494 

-0.475 

Two-part  model  using  log  transformation   Estimation  Half 


50k 

0.020 

0.019 

0.039 

0.038 

0.060 

0.086 

0.078 

25k 

0.024 

0.023 

0.048 

0.048 

0.076 

0.096 

0.089 

10k 

0.027 

0.029 

0.057 

0.058 

0.091 

0.089 

0.084 

5k 

0.030 

0.034 

0.064 

0.067 

0.104 

0.088 

0.085 

Four-part  model  using  log  transformation 

Estimation  Half 

50k 

0.031 

0.019 

0.048 

0.039 

0.062 

0.099 

0.083 

25k 

0.036 

0.023 

0.059 

0.049 

0.079 

0.110 

0.096 

10k 

0.040 

0.031 

0.066 

0.062 

0.099 

0.112 

0.105 

5k 

0.043 

0.040 

0.074 

0.074 

0.118 

0.129 

0.127 

One-part  model  using  raw  dollars 

Validation  Half 

50k 

0.025 

0.024 

0.039 

0.039 

0.071 

0.070 

0.068 

25k 

0.032 

0.030 

0.049 

0.049 

0.086 

0.096 

0.094 

10k 

0.044 

0.043 

0.064 

0.065 

0.111 

0.130 

0.128 

5k 

0.054 

0.054 

0.078 

0.080 

0.132 

0.157 

0.156 

One-part  model  using  log  transformation 

Validation  Half 

50k 

-0.012 

0.014 

0.022 

0.025 

0.046 

-0.151 

-0.161 

25k 

-0.019 

0.017 

0.023 

0.030 

0.055 

-0.201 

-0.212 

10k 

-0.032 

0.023 

0.021 

0.035 

0.059 

-0.322 

-0.334 

5k 

-0.044 

0.027 

0.016 

0.036 

0.055 

-0.442 

-0.451 

Two-part  model  using  log  transformation 

Validation  Half 

50k 

0.020

0.020 

0.038 

0.038 

0.066 

0.070 

0.069 

25k 

0.021 

0.024 

0.043 

0.045 

0.079 

0.078 

0.078 

10k 

0.024 

0.032 

0.053 

0.057 

0.099 

0.090 

0.091 

5k 

0.026 

0.039 

0.061 

0.069 

0.113 

0.096 

0.099 

Four-part  model  using  log  transformation 

Validation  Half 

50k 

0.030 

0.022 

0.046 

0.040 

0.069 

0.063 

0.053 

25k 

0.034 

0.028 

0.054 

0.049 

0.085 

0.083 

0.075 

10k 

0.043 

0.042 

0.068 

0.067 

0.112 

0.112 

0.107 

5k 

0.047 

0.055 

0.078 

0.083 

0.133 

0.142 

0.142 
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(Individual  level)  Mean  Absolute  Error  (no  copay) 


A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)

One-part  model  using  raw  dollars 

HMO-A 

50k 

95.29 

95.45 

94.54 

94.57 

92.40 

91.09 

91.16 

25k 

88.27 

88.55 

87.48 

87.63 

85.42 

84.04 

84.19 

10k 

75.72 

76.07 

74.88 

75.09 

72.92 

71.44 

71.63 

5k 

61.35 

61.63 

60.47 

60.61 

58.57 

57.27 

57.39 

One-part  model  using  log  transformation 

HMO-A 

50k 

93.71 

93.64 

92.54 

92.40 

95.86 

93.32 

93.50 

25k 

87.65 

87.55 

86.70 

86.57 

89.74 

87.19 

87.37 

10k 

76.96 

76.79 

76.24 

76.12 

78.41 

76.17 

76.32 

5k 

64.19 

64.05 

63.54 

63.47 

64.72 

63.26 

63.32 

Two-part  model  using  log  transformation 

HMO-A 

50k 

94.78 

94.78 

92.99 

92.86 

92.34 

90.56 

90.77 

25k 

88.64 

88.60 

87.05 

86.91 

86.20 

84.55 

84.75 

10k 

77.80 

77.67 

76.47 

76.31 

75.22 

73.86 

74.06 

5k 

64.83 

64.74 

63.69 

63.59 

62.19 

61.21 

61.32 

Four-part  model  using  log  transformation 

HMO-A 

50k 

95.87 

96.08 

94.27 

94.25 

92.32 

90.67 

90.63 

25k 

88.98 

89.12 

87.47 

87.40 

85.57 

83.84 

83.84 

10k 

76.50 

76.55 

75.10 

75.06 

73.23 

71.42 

71.47 

5k 

61.88 

61.92 

60.63 

60.56 

58.84 

57.31 

57.36 

One-part  model  using  raw  dollars 

HMO-B 

50k

108.66 

108.56 

107.60 

107.35 

102.97 

102.54 

102.70 

25k

101.81 

101.86 

100.59 

100.50 

96.49 

94.78 

95.08 

10k 

86.31 

86.57 

85.08 

85.21 

81.64 

79.35 

79.72 

5k 

66.74 

66.93 

65.64 

65.69 

62.73 

60.89 

61.08 

One-part  model  using  log  transformation 

HMO-B 

50k 

110.00 

109.82 

110.33 

110.37 

111.59 

115.80 

116.15 

25k 

103.77 

103.44 

104.65 

104.61 

105.35 

109.52 

109.86 

10k 

89.37 

89.00 

90.72 

90.67 

90.59 

94.56 

94.84 

5k 

71.09 

70.85 

72.18 

72.19 

71.44 

75.36 

75.57 

Two-part  model  using  log  transformation 

HMO-B 

50k 

109.07 

109.11 

106.69 

106.39 

104.87 

101.85 

101.72 

25k 

102.91 

102.84 

101.00 

100.65 

98.99 

96.42 

96.29 

10k 

88.68 

88.52 

87.32 

87.02 

85.14 

83.00 

82.85 

5k 

70.55 

70.40 

69.45 

69.22 

67.20 

65.61 

65.50 

Four-part  model  using  log  transformation 

HMO-B 

50k 

109.62 

109.96 

107.41 

107.48 

105.61 

101.83 

102.03 

25k 

103.01 

103.19 

101.04 

100.95 

98.88 

95.68 

95.77 

10k 

87.44 

87.42 

85.84 

85.61 

83.30 

80.58 

80.56 

5k 

67.49 

67.40 

66.19 

66.01 

63.64 

61.48 

61.37 
(Individual  level)  Std.  Dev.  of  Absolute  Error  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars 

HMO-A 

50k 

245.68 

246.35 

244.34 

244.77 

244.15 

241.37 

241.63 

25k 

179.99 

180.46 

178.63 

178.90 

177.73 

175.73 

175.89 

10k 

116.04 

116.19 

114.66 

114.67 

113.41 

112.15 

112.18 

5k 

78.16 

78.16 

77.06 

76.99 

76.12 

75.15 

75.15 

One-part  model  using  log  transformation    HMO-A 


50k 

260.15 

260.73 

258.48 

259.04 

256.68 

257.24 

257.75 

25k 

197.25 

197.64 

195.31 

195.73 

193.17 

195.11 

195.63 

10k 

134.68 

134.81 

132.56 

132.71 

130.84 

134.62 

134.96 

5k 

98.56 

98.46 

96.69 

96.62 

95.64 

99.92 

100.04 

Two-part  model  using  log  transformation 

HMO-A 

50k 

259.47 

260.05 

258.32 

258.84 

257.30 

255.73 

256.10 

25k 

196.44 

196.89 

195.09 

195.50 

193.75 

192.71 

193.02 

10k 

133.78 

134.02 

132.25 

132.46 

131.15 

130.84 

130.99 

5k 

97.70 

97.75 

96.34 

96.37 

95.74 

95.86 

95.88 

Four-part  model  using  log  transformation 

HMO-A

50k 

245.63 

246.31 

244.34 

244.95 

244.19 

242.21 

242.17 

25k 

179.82 

180.34 

178.43 

178.89 

177.64 

176.69 

176.79 

10k 

115.70 

115.89 

114.39 

114.51 

113.19 

112.98 

112.90 

5k 

77.96 

77.88 

76.96 

77.02 

75.96 

75.88 

75.76 

One-part  model  using  raw  dollars 

HMO-B 

50k 

237.30 

237.50 

235.74 

235.82 

232.35 

232.89 

232.94 

25k 

196.80 

196.92 

195.23 

195.21 

191.75 

191.53 

191.50 

10k 

131.70 

131.66 

130.51 

130.35 

127.58 

127.24 

127.16 

5k 

83.25 

83.10 

82.38 

82.18 

80.32 

79.98 

79.93 

One-part  model  using  log  transformation 

HMO-B 

50k 

246.35 

242.56 

241.14 

240.65 

236.23 

262.85 

263.87 

25k 

208.64 

204.15 

202.73 

201.82 

197.44 

227.48 

228.56 

10k 

147.59 

142.34 

141.48 

140.06 

137.10 

170.44 

171.21 

5k 

101.84 

96.72 

96.61 

95.00 

93.75 

124.69 

125.05 

Two-part  model  using  log  transformation 

HMO-B 

50k 

242.16 

241.98 

240.47 

240.56 

236.25 

237.24 

237.32 

25k 

203.98 

203.55 

201.93 

201.87 

197.47 

199.08 

199.11 

10k 

142.50 

141.73 

140.33 

140.04 

136.40 

138.78 

138.70 

5k 

97.06 

96.16 

95.14 

94.68 

92.30 

94.88 

94.68 

Four-part  model  using  log  transformation 

HMO-B 

50k 

236.18 

237.07 

234.82 

235.58 

231.48 

234.17 

235.45 

25k 

195.86 

196.45 

194.30 

194.91 

190.66 

192.78 

193.71 

10k 

131.14 

131.17 

129.71 

129.82 

126.44 

128.30 

128.71 

5k 

83.17 

82.65 

81.93 

81.72 

79.52 

80.69 

80.70 
(Individual  level)  Percent  of  Absolute  Error  Within  $25  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars

HMO-A

50k

22.9% 

22.7% 

29.5% 

28.0% 

zo.w  /o 

17  7% 

J / . /  /o 

25k

23.6% 

23.6% 

29.9% 

28.3% 

77  8% 

18  7% 

JO.i  /o 

38  1% 

JO.  1  /O 

10k 

25.5% 

25.3% 

29.1% 

28.5% 

32.8% 

38.2% 

37.2% 

5k

29.4% 

29.5% 

32.5% 

31.3% 

41  0% 

T"  J  .W  /U 

41.5% 

41.1% 

One-part  model  using  log  transformation 

HMO-A 

50k 

16.1% 

17.1% 

19.8% 

20.4% 

24.5% 

34.6% 

33.3% 

25k 

17.3% 

18.4% 

21.1% 

22.1% 

25.4% 

36.1% 

35.4% 

10k 

20.4% 

21.8% 

26.3% 

25.6% 

28.5% 

40.0% 

39.4% 

5k 

26.8% 

27.8% 

34.3% 

33.7% 

34.8% 

45.7% 

44.5% 

Two-part  model  using  log  transformation

HMO-A 

50k  15.1%  16.0%  17.3%  18.6%  26.5%  30.5%  29.1%

25k  16.5%  17.4%  18.5%  19.8%  28.1%  32.2%  31.9%

10k  19.3%  20.7%  21.8%  23.0%  33.0%  36.5%  35.6%

5k  25.4%  26.2%  30.2%  29.8%  41.1%  43.0%  41.9%


Four-part  model  using  log  transformation

HMO-A

50k  19.5%  20.1%  22.4%  24.0%  23.8%  32.6%  31.0%

25k  20.4%  21.0%  23.3%  24.7%  24.9%  33.8%  32.3%

10k  23.0%  24.2%  25.3%  25.7%  34.3%  36.4%  34.5%

5k  28.0%  28.9%  30.3%  30.8%  41.6%  42.1%  40.9%


One-part  model  using  raw  dollars   HMO-B


50k 

21.5% 

18.9% 

24.1% 

24.9% 

23.3% 

37.1% 

37.4% 

25k 

21.2% 

19.5% 

23.8% 

24.7% 

23.7% 

36.5% 

37.9% 

10k 

21.9% 

21.6% 

25.3% 

24.8% 

34.2% 

36.9% 

37.1% 

5k 

27.0% 

26.1% 

29.7% 

29.0% 

38.8% 

40.9% 

41.7% 

One-part  model  using  log  transformation 

HMO-B 

50k 

12.7% 

13.6% 

18.6% 

20.3%

26.9% 

33.7% 

33.8% 

25k 

14.4% 

15.7% 

20.6% 

21.8% 

28.5% 

35.5% 

35.7% 

10k 

18.3% 

20.8% 

24.9% 

25.4% 

31.8% 

38.9% 

38.9% 

5k 

26.2% 

26.2% 

37.2% 

35.1% 

37.1% 

44.2% 

44.2% 

Two-part  model  using  log  transformation 

HMO-B 

50k 

11.4% 

12.4% 

14.2% 

15.8% 

27.5% 

30.1% 

29.2% 

25k 

12.9% 

14.1% 

16.1% 

18.5% 

29.0% 

32.0% 

32.5% 

10k 

16.6% 

17.7% 

20.2% 

22.0% 

31.7% 

36.5% 

36.5% 

5k 

23.7% 

25.1% 

28.4% 

28.8% 

38.1% 

42.9% 

43.5% 

Four-part  model  using  log  transformation 

HMO-B 

50k 

15.8% 

17.9% 

18.6%

20.2% 

30.3% 

32.3% 

33.8% 

25k 

17.1% 

18.9% 

19.9% 

21.4% 

31.2% 

33.7% 

34.6% 

10k 

19.8% 

21.0% 

22.7% 

23.1% 

33.4% 

36.8% 

36.7% 

5k 

25.6% 

26.2% 

27.9% 

28.8% 

38.6% 

42.5% 

42.5% 

(Individual  level)  Percent  of  Absolute  Error  Within  $50  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars 

HMO-A 

50k 

54.1% 

45.7% 

54.1% 

54.0% 

60.4% 

57.7% 

56.7% 

25k 

55.0% 

48.9% 

54.5% 

54.4% 

61.6% 

58.4% 

57.3% 

10k 

56.8% 

53.0% 

57.6% 

55.8% 

64.2% 

61.5% 

60.8% 

5k 

65.5% 

62.7% 

65.2% 

63.8% 

68.8% 

67.9% 

68.0% 

One-part  model  using  log  transformation 

HMO-A 

50k

58.9%

52.0% 

60.6% 

58.4% 

56.8% 

62.8% 

/CO  AO/ 

oZA/o 

25k

61.3% 

54.5% 

61.4% 

60.8% 

58.1% 

64.0% 

oj.jyo 

10k 

65.5% 

59.1% 

63.6% 

62.9% 

61.9% 

66.4% 

65.9% 

5k 

71.2% 

68.2% 

67.5% 

67.9% 

66.9% 

69.7% 

69.5% 

Two-part  model  using  log  transformation

HMO-A 

50k

55.1% 

48.0% 

58.8% 

56.5% 

61.5% 

63.1% 

62.7% 

25k

57.9% 

52.3% 

60.0% 

57.7% 

62.5% 

64.3% 

63.7% 

10k 

62.9% 

58.6% 

63.2% 

61.8% 

65.1% 

67.2% 

66.4% 

5k 

69.6% 

67.6% 

68.6% 

68.2% 

69.2% 

70.9% 

70.6% 

Four-part  model  using  log  transformation 

HMO-A 

50k 

52.7% 

48.9% 

56.3% 

53.7% 

60.5% 

62.9% 

61.5% 

25k 

54.8% 

51.7% 

57.6% 

54.9% 

61.3% 

64.0% 

63.6% 

10k 

59.1% 

57.2% 

61.1% 

59.9% 

64.2% 

66.9% 

66.2% 

5k 

67.0% 

63.9% 

67.8% 

67.5% 

68.7% 

71.5% 

71.5% 

One-part  model  using  raw  dollars

HMO-B 

50k

37.7% 

37.9% 

50.1% 

47.7% 

57.2% 

55.6% 

56.4% 

25k

40.9% 

40.4% 

51.9% 

51.2% 

58.1% 

56.9% 

57.8% 

10k 

48.9% 

45.1% 

54.2% 

52.8% 

61.0% 

60.2% 

60.5% 

5k 

61.4% 

59.9% 

62.9% 

61.6% 

67.0% 

66.6% 

66.4% 

One-part  model  using  log  transformation 

HMO-B 

50k

44.3% 

41.1% 

53.5% 

51.6% 

50.6% 

57.4% 

57.0% 

25k

47.9% 

44.0% 

58.0% 

53.6% 

52.7% 

58.9% 

58.6% 

10k 

55.4% 

53.5% 

60.8% 

57.6% 

56.1% 

61.7% 

61.4% 

5k

67.5% 

60.3% 

63.8% 

63.1% 

61.0% 

65.6% 

65.2% 

Two-part  model  using  log  transformation 

HMO-B 

50k

42.3% 

39.5% 

51.0% 

48.9% 

54.3% 

58.7% 

58.0% 

25k

47.2% 

42.7% 

55.0% 

51.2% 

56.5% 

60.4% 

60.0% 

10k 

56.1% 

51.6% 

59.7% 

56.3% 

60.9% 

63.8% 

63.7% 

5k 

66.5% 

62.4% 

65.2% 

65.7% 

65.7% 

68.4% 

67.9% 

Four-part  model  using  log  transformation 

HMO-B 

50k 

42.9% 

39.6% 

48.7% 

50.1% 

54.8% 

59.3% 

59.8% 

25k 

45.7% 

41.8% 

51.3% 

51.5% 

56.5% 

60.6% 

61.0% 

10k 

52.7% 

48.3% 

56.1% 

55.0% 

59.9% 

64.0% 

63.7% 

5k 

62.6% 

62.1% 

64.1% 

63.4% 

65.7% 

68.7% 

68.4% 

(Individual  level)  Percent  of  Absolute  Error  More  Than  $400  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars

HMO-A 

50k 

3.2% 

3.2% 

3.2% 

3.1% 

2.9% 

2.8% 

2.8% 

25k 

3.3% 

3.2% 

3.2% 

3.2% 

2.9% 

2.8% 

2.8% 

10k 

3.4% 

3.3% 

3.3% 

3.3% 

3.1% 

2.9% 

2.9% 

5k 

0.0% 

0.1% 

0.0% 

0.1% 

0.0% 

0.0% 

0.0% 

One-part  model  using  log  transformation 

HMO-A 

50k 

3.4% 

3.4% 

3.2% 

3.3% 

3.3% 

3.4% 

3.5% 

25k 

3.4% 

3.4% 

3.3% 

3.3% 

3.2% 

3.4% 

3.4% 

10k 

3.5% 

3.5% 

3.3% 

3.3% 

3.0% 

3.3% 

3.3% 

5k 

1.2% 

1.2% 

1.0% 

1.0% 

1.1% 

1.4% 

1.3% 

Two-part  model  using  log  transformation

HMO-A 

50k 

3.3% 

3.3% 

3.2% 

3.2% 

2.9% 

3.1% 

3.1% 

25k 

3.4% 

3.3% 

3.3% 

3.2% 

3.0% 

3.2% 

3.2% 

10k 

3.5% 

3.4% 

3.4% 

3.4% 

3.1% 

3.2% 

3.1% 

5k 

1.1% 

1.2% 

1.0% 

1.0% 

1.1% 

1.1% 

1.1% 

Four-part  model  using  log  transformation   HMO-A 


50k 

3.2% 

3.2% 

3.1% 

3.1% 

2.9% 

3.1% 

3.1% 

25k 

3.2% 

3.2% 

3.2% 

3.1% 

3.0% 

3.1% 

3.0% 

10k 

3.4% 

3.3% 

3.3% 

3.3% 

3.1% 

3.0% 

3.0% 

5k 

0.0% 

0.0% 

0.0% 

0.0% 

0.0% 

0.0% 

0.0% 

One-part  model  using  raw  dollars   HMO-B 


50k 

4.4% 

4.4% 

4.3% 

4.1% 

3.8% 

4.3% 

4.3% 

25k 

4.4% 

4.4% 

4.3% 

4.3% 

4.0% 

4.1% 

4.1% 

10k 

4.5% 

4.5% 

4.5% 

4.4% 

4.2% 

4.2% 

4.1% 

5k 

0.0% 

0.1% 

0.1% 

0.1% 

0.0% 

0.1% 

0.1% 

One-part  model  using  log  transformation 

HMO-B 

50k 

4.6% 

4.4% 

4.3% 

4.2% 

5.3% 

6.0% 

6.0% 

25k 

4.6% 

4.4% 

4.2% 

4.2% 

5.3% 

5.8% 

5.8% 

10k 

4.6% 

4.5% 

4.4% 

4.2% 

4.6% 

5.3% 

5.4% 

5k 

1.7% 

1.6% 

1.6% 

1.6% 

1.3% 

2.6% 

2.7% 

Two-part  model  using  log  transformation 

HMO-B 

50k 

4.4% 

4.5% 

4.3% 

4.3% 

4.5% 

4.7% 

4.5% 

25k 

4.5% 

4.5% 

4.4% 

4.4% 

4.0% 

4.6% 

4.5% 

10k 

4.5% 

4.5% 

4.4% 

4.4% 

4.0% 

4.4% 

4.3% 

5k 

1.6% 

1.6% 

1.6% 

1.5% 

1.4% 

1.5% 

1.5% 

Four-part  model  using  log  transformation 

HMO-B 

50k 

4.3% 

4.4% 

4.1% 

4.2% 

4.3% 

4.6% 

4.6% 

25k 

4.3% 

4.4% 

4.2% 

4.2% 

3.9% 

4.5% 

4.4% 

10k 

4.4% 

4.5% 

4.4% 

4.4% 

4.1% 

4.1% 

4.1% 

5k 

0.0% 

0.0% 

0.0% 

0.0% 

0.0% 

0.1% 

0.0% 
HMO-A  Mean  Forecasting  Bias  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars   Groups  of  5000


50k 

-6.8% 

-7.1% 

-5.1% 

-5.3% 

-3.6% 

-3.6% 

-3.9% 

25k 

-4.3% 

-4.5% 

-2.7% 

-2.8% 

-1.0% 

-1.1% 

-1.3% 

10k 

-2.9% 

-2.9% 

-1.3% 

-1.2% 

0.5% 

0.1% 

0.1% 

5k 

-3.1% 

-3.0% 

-1.7% 

-1.5% 

-0.1% 

-0.2% 

-0.3% 

One-part  model  using  log  transformation

Groups  of  5000 

50k 

-12.0% 

-13.0% 

-17.2% 

-17.4% 

2.3% 

9.7% 

9.2% 

25k 

-9.1% 

-10.2% 

-14.3% 

-14.5% 

6.1% 

13.9% 

13.2% 

10k 

-5.8% 

-7.2% 

-10.7% 

-10.9% 

9.7% 

18.3% 

17.4% 

5k 

-4.7% 

-6.0% 

-8.6% 

-8.8% 

10.6% 

18.8% 

18.0% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k 

-9.6% 

-10.4% 

-16.3% 

-16.5% 

-4.9% 

3.4% 

2.8% 

25k 

-6.7% 

-7.6% 

-13.4% 

-13.6% 

-1.8% 

7.0% 

6.4% 

10k 

-3.4% 

-4.6% 

-9.8% 

-10.0% 

1.4% 

10.9% 

10.0% 

5k 

-2.3% 

-3.5% 

-7.8% 

-8.0% 

2.1% 

11.4% 

10.6% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

-5.0% 

-5.1% 

-4.7% 

-4.6% 

-3.2% 

-1.9% 

-2.1% 

25k 

-2.3% 

-2.5% 

-1.9% 

-1.9% 

-0.2% 

0.8% 

0.6% 

10k 

-0.5% 

-0.9% 

-0.0% 

-0.5% 

1.7% 

2.3% 

2.0% 

5k 

-1.0% 

-1.5% 

-0.5% 

-1.1% 

1.3% 

1.6% 

1.5% 

One-part  model  using  raw  dollars  Groups  of  3000 


50k 

-7.9% 

-8.2% 

-6.5% 

-6.7% 

-5.0% 

-5.0% 

-5.2% 

25k 

-5.1% 

-5.3% 

-3.7% 

-3.8% 

-1.9% 

-2.1% 

-2.3% 

10k 

-2.9% 

-2.9% 

-1.5% 

-1.5% 

0.2% 

-0.1% 

-0.2% 

5k 

-2.9% 

-2.9% 

-1.7% 

-1.6% 

-0.2% 

-0.3% 

-0.4% 

One-part  model  using  log  transformation

Groups  of  3000 

50k 

-13.3%

-14.3% 

-18.6% 

-18.8% 

0.7% 

7.8% 

7.1% 

25k 

-10.1% 

-11.2% 

-15.3% 

-15.6% 

4.9% 

12.3% 

11.6% 

10k 

-6.1% 

-7.6% 

-11.0% 

-11.4% 

9.3% 

17.5% 

16.6% 

5k 

-4.7% 

-6.1% 

-8.8% 

-9.1% 

10.4% 

18.4% 

17.5% 

Two-part  model  using  log  transformation 

Groups  of  3000 

50k 

-10.8% 

-11.7% 

-17.6% 

-17.8% 

-6.4% 

1.6% 

1.0% 

25k 

-7.6% 

-8.6% 

-14.3% 

-14.6% 

-3.0% 

5.7% 

5.0% 

10k 

-3.6% 

-4.9% 

-10.1% 

-10.4% 

0.9% 

10.4% 

9.4% 

5k 

-2.3% 

-3.6% 

-7.9% 

-8.1% 

1.9% 

11.1% 

10.3% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

-6.1% 

-6.3% 

-5.9% 

-5.9% 

-4.7% 

-3.3% 

-3.6% 

25k 

-3.0% 

-3.4% 

-2.8% 

-2.9% 

-1.3% 

-0.2% 

-0.5% 

10k 

-0.6% 

-1.0% 

-0.2% 

-0.7% 

1.3% 

1.9% 

1.6% 

5k 

-0.9% 

-1.4% 

-0.5% 

-1.1% 

1.1% 

1.5% 

1.3% 

HMO-A  Mean  Forecasting  Bias  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars

Groups  of  1500 

50k  -7.0%  -7.1%  -5.3%  -5.3%  -3.3%  -3.4%  -3.5%

25k  -4.3%  -4.4%  -2.7%  -2.6%  -0.5%  -0.7%  -0.8%

10k  -2.8%  -2.7%  -1.3%  -1.1%  0.8%  0.6%  0.6%

5k  -3.2%  -3.1%  -1.8%  -1.6%  -0.0%  0.0%  0.0%

One-part  model  using  log  transformation    Groups  of  1500 


50k

-12.3% 

-13.3% 

-17.5% 

-17.6% 

2.5% 

10.7% 

10.1% 

25k 

-9.4% 

-10.4% 

-14.4% 

-14.6% 

6.5% 

15.0% 

14.4% 

10k 

-6.1% 

-7.4% 

-10.8% 

-11.0% 

10.0% 

19.4% 

18.5% 

5k 

-5.1% 

-6.4% 

-8.9% 

-9.0% 

10.7% 

19.7% 

18.9% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

-9.9% 

-10.6% 

-16.5% 

-16.6% 

-4.7% 

4.0% 

3.5% 

25k 

-6.9% 

-7.7% 

-13.5% 

-13.6% 

-1.5% 

7.8% 

7.2% 

10k 

-3.6% 

-4.8% 

-9.9% 

-10.1% 

1.7% 

11.7% 

10.8% 

5k 

-2.7% 

-3.8% 

-8.0% 

-8.2% 

2.2% 

12.0% 

11.2% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

-5.1% 

-5.1% 

-4.8% 

-4.5% 

-2.9% 

-1.4% 

-1.5% 

25k 

-2.3% 

-2.5% 

-1.9% 

-1.8% 

0.2% 

1.4% 

1.3% 

10k 

-0.5% 

-0.9% 

-0.0% 

-0.4% 

2.0% 

2.8% 

2.6% 

5k 

-1.2% 

-1.6% 

-0.7% 

-1.2% 

1.3% 

2.0% 

1.8% 

One-part  model  using  raw  dollars   Groups  of  500


50k 

-7.5% 

-7.4% 

-6.0% 

-5.8% 

-3.6% 

-3.6% 

-3.5% 

25k 

-5.0% 

-4.8% 

-3.6% 

-3.3% 

-1.0% 

-1.2% 

-1.1% 

10k 

-3.2% 

-2.9% 

-1.9% 

-1.5% 

0.6% 

0.2% 

0.3% 

5k 

-2.9% 

-2.6% 

-1.6% 

-1.3% 

0.5% 

0.4% 

0.5% 

One-part  model  using  log  transformation

Groups  of  500 

50k 

-12.3% 

-12.9% 

-17.6% 

-17.5% 

1.9% 

9.6% 

9.3% 

25k 

-9.5% 

-10.2% 

-14.7% 

-14.6% 

5.6% 

13.6% 

13.3% 

10k 

-6.1% 

-7.1% 

-11.1% 

-10.9% 

9.3% 

18.0% 

17.5% 

5k 

-4.5% 

-5.4% 

-8.5% 

-8.3% 

10.5% 

18.9% 

18.4% 

Two-part  model  using  log  transformation 

Groups  of  500 

50k 

-9.9% 

-10.3% 

-16.6% 

-16.5% 

-4.9% 

3.4% 

3.2% 

25k 

-7.1% 

-7.5% 

-13.8% 

-13.6% 

-1.9% 

7.0% 

6.7% 

10k 

-3.7% 

-4.5% 

-10.1% 

-10.0% 

1.3% 

11.0% 

10.4% 

5k 

-2.2% 

-3.0% 

-7.8% 

-7.6% 

2.5% 

11.9% 

11.4% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

-5.7% 

-5.3% 

-5.4% 

-4.9% 

-3.1% 

-1.9% 

-1.8% 

25k 

-3.0% 

-2.8% 

-2.7% 

-2.3% 

-0.2% 

0.8% 

0.9% 

10k 

-0.9% 

-0.9% 

-0.5% 

-0.7% 

1.8% 

2.3% 

2.4% 

5k 

-0.9% 

-1.0% 

-0.4% 

-0.8% 

1.8% 

2.2% 

2.3% 
HMO-B  Mean  Forecasting  Bias  (no  copay)

Stoploss  level  ($)  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


One-part  model  using  raw  dollars 


Groups  of  5000 


50k 

-6.2% 

-5.6% 

-5.5% 

-4.9% 

-1.8% 

-6.7% 

-6.4% 

25k 

-8.5% 

-8.0% 

-7.8% 

-7.4% 

-4.6% 

-8.4% 

-8.2% 

10k 

-8.3% 

-7.9% 

-7.7% 

-7.4% 

-5.3% 

-7.8% 

-7.6% 

5k 

-7.0% 

-6.7% 

-6.5% 

-6.2% 

-4.6% 

-6.4% 

-6.3% 

One-part  model  using  log  transformation 

Groups  of  5000 

50k 

-4.5% 

-4.7% 

-9.4% 

-8.8% 

14.3% 

34.8% 

35.1% 

25k 

-6.6% 

-7.2% 

-9.9% 

-9.4% 

12.9% 

33.4% 

33.4% 

10k 

-6.3% 

-7.2% 

-7.6% 

-7.0% 

14.8% 

34.9% 

34.8% 

5k

-4.8% 

-5.6% 

-4.0% 

-3.3% 

1  "7  AO/ 

1  /  A/o 

35.5% 

35.4% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k 

-5.3% 

-5.0% 

-14.2% 

-14.1% 

1.6% 

8.2% 

7.9% 

25k 

-7.3% 

-7.3% 

-14.8% 

-14.6% 

-0.0% 

6.5% 

6.0% 

10k 

-6.9% 

-7.2% 

-12.6% 

-12.5% 

1.0% 

7.4% 

6.8% 

5k 

-5.3% 

-5.6% 

-9.4% 

-9.2% 

2.7% 

8.2% 

7.7% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

-2.2% 

-1.7% 

-2.4% 

-2.1% 

4.0% 

0.9% 

1.2% 

25k 

-2.3% 

-2.5% 

-1.9% 

-1.9% 

-0.2% 

0.8% 

0.6% 

10k 

-4.5% 

-4.5% 

-4.1% 

-4.5% 

0.4% 

-0.3% 

-0.3% 

5k 

-3.8% 

-4.0% 

-3.3% 

-3.7% 

0.4% 

-0.1% 

-0.2% 

One-part  model  using  raw  dollars    Groups  of  3000 


50k 

-6.6% 

-6.0% 

-5.7% 

-5.1% 

-2.3% 

-6.9% 

-6.6% 

25k 

-8.9% 

-8.5% 

-8.2% 

-7.7% 

-5.2% 

-8.6% 

-8.4% 

10k 

-8.8% 

-8.5% 

-8.2% 

-7.8% 

-6.0% 

-8.1% 

-8.0% 

5k 

-7.6% 

-7.3% 

-7.0% 

-6.7% 

-5.3% 

-6.8% 

-6.7% 

One-part  model  using  log  transformation 

Groups  of  3000 

50k 

-4.9% 

-5.4% 

-9.7% 

-9.2% 

13.6% 

34.5% 

34.7% 

25k 

-7.0% 

-7.8% 

-10.3% 

-9.8% 

12.2% 

32.9% 

32.9% 

10k 

-6.9% 

-7.9% 

-8.0% 

-7.4% 

14.0% 

34.3% 

34.2% 

5k 

-5.3% 

-6.3% 

-4.4% 

-3.8% 

16.6% 

35.0% 

34.8% 

Two-part  model  using  log  transformation 

Groups  of  3000 

50k 

-5.8% 

-5.6% 

-14.6% 

-14.5% 

0.9% 

7.9% 

7.5% 

25k 

-7.8% 

-8.0% 

-15.2% 

-15.1% 

-0.7% 

6.2% 

5.7% 

10k 

-7.4% 

-7.8% 

-13.0% 

-13.0% 

0.3% 

6.9% 

6.4% 

5k 

-5.9% 

-6.3% 

-9.8% 

-9.7% 

1.9% 

7.8% 

7.3% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k

-2.5% 

-2.1% 

-2.7% 

-2.3% 

3.4% 

0.7% 

1.0% 

25k 

-3.0% 

-3.4% 

-2.8% 

-2.9% 

-1.3% 

-0.2% 

-0.5% 

10k 

-5.0% 

-5.0% 

-4.5% 

-5.0% 

-0.3% 

-0.7% 

-0.7% 

5k 

-4.4% 

-4.5% 

-3.7% 

-4.2% 

-0.4% 

-0.5% 

-0.7% 

HMO-B  Mean  Forecasting  Bias  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars 

Groups  of  1500 

50k 

-6.5% 

-5.9% 

-6.2% 

-5.6% 

-2.5% 

-7.9% 

-7.5% 

25k 

-8.5% 

-8.0% 

-8.2% 

-7.7% 

-5.0% 

-9.1% 

-8.8% 

10k 

-8.2% 

-7.8% 

-7.9% 

-7.6% 

-5.7% 

-8.3% 

-8.1% 

5k 

-6.3% 

-6.0% 

-6.1% 

-5.8% 

-4.3% 

-6.2% 

-6.1% 

One-part  model  using  log  transformation 


Groups  of  1500 


50k 

-4.7% 

-4.7% 

-10.0% 

-9.3% 

13.1% 

32.2% 

32.4% 

25k 

-6.6% 

-6.9% 

-10.3% 

-9.6% 

12.0% 

31.0% 

31.1% 

10k 

-6.5% 

-7.1% 

-8.0% 

-7.3% 

13.8% 

32.5% 

32.5% 

5k 

-4.4% 

-4.9% 

-3.9% 

-3.1% 

17.1% 

34.1% 

34.0% 

Two-part  model  using  log  transformation 


Groups  of  1500 


50k  -5.4%  -5.0%  -14.6%  -14.4%  0.8%  7.1%  6.7%

25k  -7.2%  -7.2%  -14.9%  -14.8%  -0.6%  5.6%  5.2%

10k  -6.9%  -7.1%  -12.8%  -12.7%  0.4%  6.5%  5.9%

5k  -4.7%  -5.0%  -9.1%  -8.9%  2.6%  8.0%  7.5%


Four-part  model  using  log  transformation 


Groups  of  1500 


50k 

-2.4% 

-1.9% 

-2.9% 

-2.6% 

3.1% 

-0.0% 

0.3% 

25k 

-2.3% 

-2.5% 

-1.9% 

-1.8% 

0.2% 

1.4% 

1.3% 

10k 

-4.4% 

-4.4% 

-4.3% 

-4.7% 

-0.2% 

-0.7% 

-0.8% 

5k 

-3.2% 

-3.2% 

-2.8% 

-3.2% 

0.5% 

0.1% 

-0.0% 

One-part  model  using  raw  dollars 


Groups  of  500 


50k 

-6.4% 

-6.5% 

-4.5% 

-4.7% 

-2.7% 

-6.9% 

-7.0% 

25k 

-8.6% 

-8.7% 

-7.1% 

-7.2% 

-5.4% 

-8.9% 

-9.0% 

10k 

-7.7% 

-7.7% 

-6.4% 

-6.5% 

-5.2% 

-7.8% 

-7.9% 

5k 

-6.2% 

-6.2% 

-5.1% 

-5.1% 

-4.1% 

-6.3% 

-6.3% 

One-part  model  using  log  transformation 

Groups  of  500 

50k 

-3.2% 

-3.0% 

-7.3% 

-6.4% 

14.1% 

33.6% 

34.2% 

25k 

-5.3% 

-5.5% 

-7.9% 

-7.1% 

12.6% 

31.9% 

32.3% 

10k 

-4.4% 

-4.9% 

-4.9% 

-4.0% 

15.1% 

34.1% 

34.4% 

5k 

-2.9% 

-3.3% 

-1.2% 

-0.3% 

17.8% 

34.8% 

35.0% 

Two-part  model  using  log  transformation 

Groups  of  500 

50k 

-4.4% 

-4.2% 

-12.8% 

-12.8% 

1.6% 

7.8% 

7.4% 

25k 

-6.4% 

-6.5% 

-13.4% 

-13.4% 

-0.2% 

6.0% 

5.5% 

10k 

-5.3% 

-5.7% 

-10.6% 

-10.6% 

1.5% 

7.5% 

6.9% 

5k 

-3.7% 

-4.1% 

-7.4% 

-7.3% 

3.2% 

8.4% 

7.8% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k 

-2.5% 

-2.4% 

-2.1% 

-2.2% 

2.9% 

-0.9% 

-0.7% 

25k 

-3.0% 

-2.8% 

-2.7% 

-2.3% 

-0.2% 

0.8% 

0.9% 

10k 

-3.6% 

-3.9% 

-2.8% 

-3.2% 

0.4% 

-1.0% 

-1.0% 

5k 

-2.7% 

-3.1% 

-1.7% 

-2.1% 

0.7% 

-0.4% 

-0.5% 

HMO-A  Mean  Squared  Forecasting  Error  (no  copay)

Stoploss  level  ($)  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


One-part  model  using  raw  dollars  Groups  of  5000


50k 

0.6% 

0.6% 

0.4% 

0.4% 

0.3% 

0.3% 

0.3% 

25k 

0.3% 

0.3% 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

10k 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

5k 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

One-part  model  using  log  transformation   Groups  of  5000 


50k  1.6%  1.8%  3.1%  3.2%  0.2%  1.2%  1.1%

25k  0.9%  1.2%  2.1%  2.2%  0.5%  2.2%  2.0%

10k  0.4%  0.6%  1.2%  1.3%  1.1%  3.5%  3.2%

5k  0.3%  0.4%  0.8%  0.8%  1.2%  3.7%  3.4%

Two-part  model  using  log  transformation 

Groups  of  5000 

50k  1.1%  1.2%  2.8%  2.8%  0.4%  0.3%  0.3%

25k  0.6%  0.7%  1.9%  1.9%  0.2%  0.7%  0.6%

10k  0.2%  0.3%  1.0%  1.1%  0.1%  1.3%  1.1%

5k  0.1%  0.2%  0.7%  0.7%  0.1%  1.4%  1.2%

Four-part  model  using  log  transformation 

Groups  of  5000 

50k  0.4%  0.4%  0.3%  0.3%  0.3%  0.2%  0.2%

25k  0.2%  0.2%  0.1%  0.1%  0.1%  0.1%  0.1%

10k  0.1%  0.1%  0.1%  0.1%  0.1%  0.1%  0.1%

5k  0.1%  0.1%  0.1%  0.1%  0.1%  0.1%  0.1%

One-part  model  using  raw  dollars    Groups  of  3000 


50k 

0.8% 

0.9% 

0.6% 

0.6% 

0.5% 

0.5% 

0.5% 

25k 

0.4% 

0.5% 

0.3% 

0.3% 

0.2% 

0.2% 

0.2% 

10k 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

5k 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

One-part  model  using  log  transformation

Groups  of  3000 

50k 

2.0% 

2.3% 

3.6% 

3.7% 

0.3% 

1.0% 

0.9% 

25k 

1.2% 

1.4% 

2.5% 

2.5% 

0.5% 

1.9% 

1.7% 

10k 

0.5% 

0.7% 

1.3% 

1.4% 

1.1% 

3.4% 

3.1% 

5k 

0.3% 

0.5% 

0.9% 

0.9% 

1.3% 

3.6% 

3.3% 

Two-part  model  using  log  transformation 

Groups  of  3000 

50k 

1.4% 

1.6% 

3.2% 

3.3% 

0.7% 

0.3% 

0.3% 

25k 

0.7% 

0.9% 

2.2% 

2.3% 

0.3% 

0.6% 

0.5% 

10k 

0.3% 

0.4% 

1.1% 

1.2% 

0.2% 

1.3% 

1.1% 

5k 

0.2% 

0.2% 

0.7% 

0.7% 

0.2% 

1.4% 

1.2% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

0.6% 

0.6% 

0.5% 

0.5% 

0.4% 

0.3% 

0.3% 

25k 

0.3% 

0.3% 

0.2% 

0.2% 

0.2% 

0.2% 

0.2% 

10k 

0.1% 

0.1% 

0.1% 

0.1% 

0.2% 

0.2% 

0.2% 

5k 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 
Stoploss  level  ($)  A_G(1)  A_G(2)  CHR(1)  CHR(2)  ACG  ADG(1)  ADG(2)


One-part  model  using  raw  dollars    Groups  of  1500 


50k 

1.1% 

1.1% 

0.9% 

0.9% 

0.9% 

0.7% 

0.7% 

25k 

0.6% 

0.7% 

0.5% 

0.5% 

0.6% 

0.5% 

0.5% 

10k 

0.4% 

0.4% 

0.3% 

0.3% 

0.4% 

0.3% 

0.3% 

5k 

0.3% 

0.3% 

0.3% 

0.2% 

0.3% 

0.3% 

0.3% 

One-part  model  using  log  transformation 

Groups  of  1500 

50k

2.1% 

2.4% 

3.6% 

3.7% 

1.0% 

2.3% 

2.2% 

25k 

1.3% 

1.6% 

2.5% 

2.6% 

1.2% 

3.3% 

3.1% 

10k

0.7% 

0.9% 

1.4% 

1.5% 

1.6% 

4.6% 

4.3% 

5k

0.5% 

0.7% 

1.0% 

1.1% 

1.6% 

4.6% 

4.3% 

Two-part  model  using  log  transformation

Groups  of  1500 

50k 

1.6% 

1.8% 

3.2% 

3.3% 

1.1% 

1.0% 

1.0% 

25k 

0.9% 

1.1% 

2.2% 

2.3% 

0.7% 

1.3% 

1.3% 

10k 

0.4% 

0.6% 

1.2% 

1.3% 

0.5% 

1.9% 

1.7% 

5k 

0.3% 

0.4% 

0.9% 

0.9% 

0.4% 

1.9% 

1.7% 

Four-part  model  using  log  transformation 

Groups  of  1500 

50k 

0.9% 

0.9% 

0.9% 

0.8% 

0.9% 

0.6% 

0.7% 

25k 

0.5% 

0.5% 

0.5% 

0.5% 

0.6% 

0.5% 

0.5% 

10k 

0.3% 

0.3% 

0.3% 

0.3% 

0.5% 

0.4% 

0.4% 

5k 

0.2% 

0.3% 

0.2% 

0.3% 

0.4% 

0.3% 

0.3% 

One-part  model  using  raw  dollars   Groups  of  500 

50k  2.5%  2.6%  2.3%  2.4%  2.1%  1.9%  1.9%

25k  1.6%  1.6%  1.4%  1.4%  1.3%  1.2%  1.2%

10k  1.0%  1.0%  0.8%  0.8%  0.8%  0.8%  0.8%

5k  0.7%  0.7%  0.6%  0.6%  0.6%  0.6%  0.6%


One-part  model  using  log  transformation  Groups  of  500 


50k 

3.2% 

3.4% 

4.6% 

4.6% 

2.5% 

3.8% 

3.8% 

25k 

2.1% 

2.3% 

3.2% 

3.2% 

2.0% 

4.0% 

3.9% 

10k 

1.2% 

1.4% 

1.9% 

1.9% 

2.0% 

4.8% 

4.7% 

5k 

0.9% 

1.0% 

1.3% 

1.2% 

2.1% 

5.1% 

4.9% 

Two-part  model  using  log  transformation

Groups  of  500 

50k 

2.8% 

2.9% 

4.3% 

4.3% 

2.3% 

2.4% 

2.4% 

25k 

1.7% 

1.9% 

2.9% 

2.9% 

1.4% 

2.0% 

2.0% 

10k 

1.0% 

1.1% 

1.7% 

1.7% 

0.9% 

2.3% 

2.2% 

5k 

0.8% 

0.8% 

1.2% 

1.1% 

0.8% 

2.4% 

2.3% 

Four-part  model  using  log  transformation 

Groups  of  500 

50k

2.3% 

2.3% 

2.2% 

2.2% 

2.1% 

1.9% 

1.9% 

25k 

1.4% 

1.5% 

1.3% 

1.3% 

1.3% 

1.2% 

1.2% 

10k 

0.9% 

0.9% 

0.8% 

0.8% 

0.8% 

0.8% 

0.8% 

5k 

0.7% 

0.7% 

0.6% 

0.6% 

0.7% 

0.7% 

0.7% 

HMO-B       Mean  Squared  Forecasting  Error  (no  copay) 


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars

Groups  of  5000 

50k 

0.5% 

0.4% 

0.4% 

0.3% 

0.1% 

0.5% 

0.5% 

25k 

0.8% 

0.7% 

0.7% 

0.6% 

0.3% 

0.8% 

0.7% 

10k 

0.7% 

0.7% 

0.7% 

0.6% 

0.3% 

0.6% 

0.6% 

5k 

0.6% 

0.5% 

0.5% 

0.4% 

0.3% 

0.4% 

0.4% 

One-part  model  using  log  transformation  Groups  of  5000 


50k 

0.3% 

0.3% 

1.0% 

0.9% 

2.2% 

12.3% 

12.5% 

25k 

0.5% 

0.6% 

1.1% 

0.9% 

1.8% 

11.3% 

11.3% 

10k 

0.5% 

0.6% 

0.6% 

0.6% 

2.3% 

12.3% 

12.2% 

5k 

0.3% 

0.4% 

0.2% 

0.2% 

3.1% 

12.7%

12.6% 

Two-part  model  using  log  transformation 

Groups  of  5000 

50k 

0.4%

0.3% 

2.1% 

2.0% 

0.1% 

0.8% 

0.7% 

25k 

0.6% 

0.6% 

2.2% 

2.2% 

0.1% 

0.5% 

0.4% 

10k 

0.5% 

0.6% 

1.6% 

1.6% 

0.1% 

0.6% 

0.5% 

5k 

0.4% 

0.4% 

0.9% 

0.9% 

0.1% 

0.7% 

0.6% 

Four-part  model  using  log  transformation 

Groups  of  5000 

50k 

0.1% 

0.1% 

0.1% 

0.1% 

0.2% 

0.1% 

0.1% 

25k 

0.2% 

0.2% 

0.1% 

0.1% 

0.1% 

0.1% 

0.1% 

10k 

0.3% 

0.3% 

0.2% 

0.3% 

0.1% 

0.0% 

0.0% 

5k 

0.2% 

0.2% 

0.2% 

0.2% 

0.0% 

0.0% 

0.0% 

One-part  model  using  raw  dollars  Groups  of  3000


50k 

0.6% 

0.6% 

0.5% 

0.5% 

0.3% 

0.7% 

0.7% 

25k 

1.0% 

0.9% 

0.8% 

0.8% 

0.5% 

0.9% 

0.9% 

10k 

0.9% 

0.9% 

0.8% 

0.7% 

0.5% 

0.8% 

0.7% 

5k 

0.7% 

0.6% 

0.6% 

0.5% 

0.4% 

0.5% 

0.5% 

One-part  model  using  log  transformation 

Groups  of  3000 

50k 

0.5% 

0.5% 

1.1% 

1.0% 

2.2% 

12.6% 

12.7% 

25k 

0.7% 

0.8% 

1.2% 

1.1% 

1.8% 

11.4% 

11.4% 

10k 

0.6% 

0.8% 

0.8% 

0.7% 

2.1% 

12.2% 

12.1% 

5k 

0.4% 

0.5% 

0.3% 

0.3% 

2.9% 

12.5% 

12.4% 

Two-part  model  using  log  transformation 

Groups  of  3000 

50k 

0.5% 

0.5% 

2.3% 

2.3% 

0.3% 

1.0% 

0.9% 

25k 

0.8% 

0.8% 

2.4% 

2.4% 

0.2% 

0.6% 

0.6% 

10k 

0.7% 

0.7% 

1.8% 

1.8% 

0.1% 

0.6% 

0.6% 

5k 

0.5% 

0.5% 

1.1% 

1.0% 

0.1% 

0.7% 

0.6% 

Four-part  model  using  log  transformation 

Groups  of  3000 

50k 

0.3% 

0.3% 

0.3% 

0.3% 

0.4% 

0.3% 

0.3% 

25k 

0.3%

0.3% 

0.2% 

0.2% 

0.2% 

0.2% 

0.2% 

10k 

0.4% 

0.4% 

0.3% 

0.4% 

0.1% 

0.1% 

0.1% 

5k 

0.3% 

0.3% 

0.2% 

0.3% 

0.1% 

0.1% 

0.1% 
Mean  Squared  Forecasting  Error  (no  copay)


Stoploss  level  ($)

A_G(1)

A_G(2)

CHR(1)

CHR(2)

ACG

ADG(1)

ADG(2)

One-part  model  using  raw  dollars  Groups  of  1500


50k 

1.0% 

0.9% 

0.9% 

0.8% 

0.7% 

1.2% 

1.1% 

25k 

1.1% 

1.1% 

1.1% 

1.0% 

0.7% 

1.2% 

1.2% 

10k 

1.0% 

0.9% 

0.9% 

0.8% 

0.6% 

0.9% 

0.9% 

5k 

23.7% 

23.7% 

23.7% 

23.7% 

23.7% 

23.7% 

23.7% 

One-part  model  using  log  transformation  Groups  of  1500 


50k 

0.9% 

0.9% 

1.5% 

1.4% 

2.7% 

11.4% 

11.6% 

25k 

1.0% 

1.1% 

1.5% 

1.4% 

2.2% 

10.5% 

10.6% 

10k 

0.8% 

0.9% 

1.0% 

0.9% 

2.4% 

11.2% 

11.2% 

5k 

0.5% 

0.6% 

0.4% 

0.4% 

3.3% 

12.1% 

12.0% 

Two-part  model  using  log  transformation 

Groups  of  1500 

50k 

0.9% 

0.9% 

2.6% 

2.6% 

0.7% 

1.2% 

1.2% 

25k 

1.0% 

1.0% 

2.6% 

2.6% 

0.6% 

0.9% 

0.8% 

10k 

0.8% 

0.8% 

1.9% 

1.9% 

0.4% 

0.8% 

0.7% 

5k 

0.5% 

0.5% 

1.0% 

1.0% 

0.3% 

0.9% 

0.8% 

Four-part  model  using  log  transformation  Groups  of  1500 


50k

0.7% 

0.6% 

0.6% 

0.6% 

0.8% 

0.6% 

0.7% 

25k 

0.5% 

0.5% 

0.5% 

0.5% 

0.6% 

0.5% 

0.5% 

10k 

0.5% 

0.5% 

0.5% 

0.5% 

0.3% 

0.3% 

0.3% 

5k 

0.4% 

0.4% 

0.3% 

0.3% 

0.2% 

0.2% 

0.2% 

One-part  model  using  raw  dollars    Groups  of  500 


50k       1.7%    1.7%    1.8%    1.8%    1.9%    2.4%    2.5%
25k       1.8%    1.8%    1.6%    1.6%    1.6%    2.0%    2.1%
10k       1.5%    1.4%    1.2%    1.1%    1.1%    1.3%    1.3%
5k        1.1%    1.1%    0.9%    0.8%    0.8%    0.9%    0.9%

One-part model using log transformation
Groups of 500

50k       1.8%    1.7%    2.2%    2.2%    5.3%   17.3%   18.0%
25k       1.8%    1.6%    1.9%    1.9%    4.1%   14.9%   15.5%
10k       1.5%    1.3%    1.2%    1.2%    3.9%   14.7%   15.1%
5k        1.2%    1.0%    0.8%    0.8%    4.4%   14.4%   14.7%

Two-part model using log transformation
Groups of 500

50k       1.7%    1.6%    2.9%    2.9%    2.4%    3.2%    3.2%
25k       1.6%    1.6%    2.8%    2.8%    1.8%    2.3%    2.3%
10k       1.3%    1.2%    1.9%    1.8%    1.2%    1.7%    1.7%
5k        1.0%    0.9%    1.2%    1.1%    1.0%    1.5%    1.5%

Four-part  model  using  log  transformation 

Groups  of  500 

50k       1.5%    1.4%    1.5%    1.5%    2.2%    2.0%    2.1%
25k       1.4%    1.5%    1.3%    1.3%    1.3%    1.2%    1.2%
10k       1.1%    1.1%    0.9%    0.9%    1.0%    0.8%    0.8%
5k        0.9%    0.8%    0.7%    0.7%    0.7%    0.5%    0.5%

HMO-A  Percent of Groups within 5% of Target (no copay)

Stoploss level ($)   A G(1)   A G(2)   CHR(1)   CHR(2)   ACG   ADG(1)   ADG(2)

One-part model using raw dollars
Groups of 5000

50k      30.0%   23.3%   48.3%   46.7%   58.3%   63.3%   61.7%
25k      51.7%   51.7%   70.0%   73.3%   80.0%   83.3%   81.7%
10k      80.0%   80.0%   93.3%   93.3%   90.0%   96.7%   96.7%
5k       78.3%   78.3%   90.0%   93.3%   98.3%   98.3%   98.3%

One-part model using log transformation
Groups of 5000

50k       5.0%    3.3%    0.0%    0.0%   70.0%   20.0%   23.3%
25k      10.0%    8.3%    0.0%    0.0%   38.3%    3.3%    5.0%
10k      48.3%   20.0%    1.7%    1.7%   10.0%    0.0%    0.0%
5k       60.0%   40.0%    3.3%    3.3%    3.3%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  5000 

50k       8.3%    8.3%    0.0%    0.0%   46.7%   63.3%   66.7%
25k      26.7%   18.3%    0.0%    0.0%   70.0%   31.7%   38.3%
10k      66.7%   60.0%    1.7%    1.7%   81.7%    6.7%    8.3%
5k       85.0%   71.7%    6.7%    6.7%   80.0%    1.7%    3.3%

Four-part  model  using  log  transformation 

Groups  of  5000 

50k      45.0%   43.3%   48.3%   50.0%   61.7%   71.7%   68.3%
25k      81.7%   76.7%   85.0%   83.3%   81.7%   88.3%   85.0%
10k      90.0%   86.7%   96.7%   96.7%   81.7%   80.0%   80.0%
5k       93.3%   86.7%   98.3%   93.3%   88.3%   90.0%   93.3%

One-part  model  using  raw  dollars 

Groups  of  3000 

50k      28.3%   26.7%   41.7%   36.7%   40.0%   46.7%   46.7%
25k      50.0%   45.0%   60.0%   61.7%   68.3%   65.0%   66.7%
10k      70.0%   68.3%   83.3%   83.3%   81.7%   80.0%   80.0%
5k       75.0%   73.3%   86.7%   86.7%   88.3%   90.0%   90.0%

One-part  model  using  log  transformation 

Groups  of  3000 

50k       3.3%    1.7%    0.0%    0.0%   65.0%   28.3%   38.3%
25k      11.7%    6.7%    0.0%    0.0%   55.0%   10.0%   11.7%
10k      35.0%   25.0%    3.3%    3.3%   23.3%    1.7%    1.7%
5k       53.3%   38.3%   11.7%    8.3%    3.3%    1.7%    1.7%

Two-part  model  using  log  transformation 

Groups  of  3000 

50k      11.7%    5.0%    0.0%    0.0%   25.0%   65.0%   61.7%
25k      25.0%   21.7%    0.0%    0.0%   56.7%   46.7%   53.3%
10k      61.7%   46.7%    5.0%    5.0%   71.7%   11.7%   15.0%
5k       76.7%   60.0%   18.3%   16.7%   71.7%    5.0%   10.0%

Four-part  model  using  log  transformation 

Groups  of  3000 

50k      38.3%   38.3%   43.3%   43.3%   41.7%   55.0%   55.0%
25k      65.0%   61.7%   61.7%   60.0%   75.0%   76.7%   83.3%
10k      78.3%   85.0%   85.0%   83.3%   75.0%   71.7%   73.3%
5k       86.7%   85.0%   93.3%   88.3%   80.0%   78.3%   80.0%

HMO-A  Percent of Groups within 5% of Target (no copay)

Stoploss level ($)   A G(1)   A G(2)   CHR(1)   CHR(2)   ACG   ADG(1)   ADG(2)

One-part model using raw dollars
Groups of 1500

50k      28.3%   30.0%   36.7%   38.3%   41.7%   36.7%   41.7%
25k      38.3%   38.3%   45.0%   48.3%   51.7%   55.0%   60.0%
10k      56.7%   53.3%   68.3%   71.7%   70.0%   68.3%   68.3%
5k       51.7%   55.0%   71.7%   71.7%   73.3%   68.3%   66.7%

One-part  model  using  log  transformation 


Groups  of  1500 


50k      15.0%   13.3%    1.7%    3.3%   43.3%   33.3%   33.3%
25k      20.0%   16.7%    3.3%    3.3%   38.3%   11.7%   13.3%
10k      36.7%   23.3%    5.0%    5.0%   28.3%    0.0%    1.7%
5k       40.0%   33.3%   11.7%   11.7%   16.7%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  1500 

50k      20.0%   18.3%    1.7%    3.3%   36.7%   45.0%   43.3%
25k      31.7%   26.7%    3.3%    5.0%   48.3%   36.7%   41.7%
10k      43.3%   41.7%    6.7%    5.0%   65.0%   13.3%   20.0%
5k       56.7%   53.3%   20.0%   13.3%   60.0%   10.0%   11.7%

Four-part model using log transformation
Groups of 1500

50k      35.0%   31.7%   36.7%   38.3%   41.7%   46.7%   50.0%
25k      45.0%   48.3%   45.0%   50.0%   53.3%   58.3%   63.3%
10k      70.0%   71.7%   70.0%   75.0%   65.0%   68.3%   65.0%
5k       71.7%   68.3%   73.3%   73.3%   68.3%   71.7%   75.0%

One-part  model  using  raw  dollars 

Groups  of  500 

50k      15.0%   15.0%   18.3%   21.7%   30.0%   25.0%   25.0%
25k      21.7%   25.0%   23.3%   28.3%   35.0%   40.0%   41.7%
10k      38.3%   40.0%   43.3%   43.3%   51.7%   43.3%   45.0%
5k       40.0%   45.0%   45.0%   48.3%   46.7%   53.3%   51.7%

One-part model using log transformation
Groups of 500

50k      11.7%   13.3%   11.7%   13.3%   23.3%   28.3%   25.0%
25k      13.3%   13.3%    6.7%    8.3%   30.0%   28.3%   26.7%
10k      31.7%   28.3%   11.7%   11.7%   33.3%    8.3%   11.7%
5k       33.3%   28.3%   20.0%   21.7%   31.7%   11.7%    6.7%

Two-part model using log transformation
Groups of 500

50k      11.7%   13.3%   13.3%   13.3%   21.7%   26.7%   26.7%
25k      16.7%   20.0%    5.0%    8.3%   35.0%   36.7%   33.3%
10k      33.3%   31.7%   15.0%   16.7%   38.3%   36.7%   33.3%
5k       40.0%   38.3%   25.0%   20.0%   41.7%   31.7%   31.7%

Four-part  model  using  log  transformation 

Groups  of  500 

50k      16.7%   20.0%   21.7%   23.3%   31.7%   23.3%   26.7%
25k      25.0%   30.0%   28.3%   33.3%   40.0%   41.7%   36.7%
10k      45.0%   46.7%   43.3%   43.3%   48.3%   48.3%   45.0%
5k       50.0%   50.0%   53.3%   50.0%   41.7%   46.7%   43.3%

HMO-B  Percent of Groups within 5% of Target (no copay)

Stoploss level ($)   A G(1)   A G(2)   CHR(1)   CHR(2)   ACG   ADG(1)   ADG(2)


One-part  model  using  raw  dollars  Groups  of  5000 


50k      28.3%   36.7%   38.3%   45.0%   91.7%   28.3%   31.7%
25k      10.0%   13.3%   16.7%   16.7%   55.0%    6.7%    8.3%
10k      11.7%   15.0%   15.0%   18.3%   46.7%    6.7%   11.7%
5k       21.7%   21.7%   23.3%   25.0%   60.0%   21.7%   25.0%

One-part  model  using  log  transformation 

Groups  of  5000 

50k      51.7%   51.7%    6.7%    6.7%    0.0%    0.0%    0.0%
25k      28.3%   21.7%    6.7%    6.7%    0.0%    0.0%    0.0%
10k      28.3%   18.3%   15.0%   23.3%    0.0%    0.0%    0.0%
5k       51.7%   36.7%   58.3%   73.3%    0.0%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  5000 

50k      40.0%   46.7%    0.0%    0.0%   81.7%   13.3%   18.3%
25k      23.3%   20.0%    0.0%    0.0%   91.7%   33.3%   36.7%
10k      21.7%   20.0%    0.0%    0.0%   95.0%   15.0%   23.3%
5k       38.3%   36.7%    5.0%    6.7%   88.3%    6.7%   10.0%

Four-part  model  using  log  transformation 

Groups  of  5000 

50k      78.3%   88.3%   81.7%   86.7%   68.3%   88.3%   86.7%
25k      81.7%   76.7%   85.0%   83.3%   81.7%   88.3%   85.0%
10k      51.7%   55.0%   56.7%   50.0%  100.0%  100.0%   98.3%
5k       68.3%   66.7%   76.7%   71.7%   98.3%  100.0%  100.0%

One-part  model  using  raw  dollars   Groups  of  3000 


50k      28.3%   33.3%   30.0%   35.0%   56.7%   26.7%   31.7%
25k      20.0%   21.7%   21.7%   21.7%   45.0%   13.3%   16.7%
10k      15.0%   16.7%   16.7%   20.0%   33.3%   15.0%   15.0%
5k       21.7%   25.0%   26.7%   26.7%   48.3%   25.0%   26.7%

One-part  model  using  log  transformation    Groups  of  3000 


50k      38.3%   33.3%   13.3%   16.7%    3.3%    0.0%    0.0%
25k      30.0%   25.0%   10.0%   11.7%    6.7%    0.0%    0.0%
10k      30.0%   23.3%   20.0%   20.0%    0.0%    0.0%    0.0%
5k       43.3%   33.3%   51.7%   58.3%    0.0%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  3000 

50k      33.3%   30.0%    3.3%    3.3%   73.3%   33.3%   36.7%
25k      23.3%   23.3%    3.3%    3.3%   71.7%   45.0%   48.3%
10k      25.0%   25.0%    1.7%    3.3%   85.0%   35.0%   38.3%
5k       38.3%   36.7%    6.7%    6.7%   88.3%   23.3%   26.7%

Four-part  model  using  log  transformation 

Groups  of  3000 

50k      65.0%   63.3%   61.7%   65.0%   58.3%   68.3%   68.3%
25k      65.0%   61.7%   61.7%   60.0%   75.0%   76.7%   83.3%
10k      46.7%   43.3%   48.3%   45.0%   83.3%   80.0%   78.3%
5k       53.3%   51.7%   66.7%   56.7%   91.7%   88.3%   88.3%

HMO-B  Percent of Groups within 5% of Target (no copay)

Stoploss level ($)   A G(1)   A G(2)   CHR(1)   CHR(2)   ACG   ADG(1)   ADG(2)

One-part model using raw dollars
Groups of 1500

50k      33.3%   41.7%   36.7%   36.7%   45.0%   38.3%   38.3%
25k      26.7%   30.0%   26.7%   31.7%   45.0%   30.0%   33.3%
10k      25.0%   25.0%   28.3%   31.7%   45.0%   26.7%   25.0%
5k       41.7%   41.7%   45.0%   45.0%   41.7%   50.0%   50.0%

One-part model using log transformation
Groups of 1500

50k      33.3%   33.3%   25.0%   26.7%   20.0%    0.0%    0.0%
25k      28.3%   28.3%   23.3%   26.7%   21.7%    0.0%    0.0%
10k      30.0%   33.3%   28.3%   35.0%    8.3%    0.0%    0.0%
5k       43.3%   38.3%   48.3%   53.3%    1.7%    0.0%    0.0%

Two-part  model  using  log  transformation 

Groups  of  1500 

50k      33.3%   38.3%   11.7%   11.7%   40.0%   35.0%   36.7%
25k      30.0%   26.7%    8.3%   10.0%   41.7%   36.7%   41.7%
10k      31.7%   28.3%    6.7%    6.7%   53.3%   38.3%   41.7%
5k       46.7%   43.3%   20.0%   21.7%   58.3%   28.3%   31.7%

Four-part  model  using  log  transformation 

Groups  of  1500 

50k      45.0%   41.7%   45.0%   46.7%   38.3%   40.0%   40.0%
25k      45.0%   48.3%   45.0%   50.0%   53.3%   58.3%   63.3%
10k      43.3%   46.7%   50.0%   48.3%   55.0%   56.7%   58.3%
5k       56.7%   58.3%   58.3%   60.0%   70.0%   68.3%   68.3%

One-part  model  using  raw  dollars  Groups  of  500 


50k      31.7%   33.3%   35.0%   36.7%   28.3%   28.3%   26.7%
25k      28.3%   35.0%   31.7%   36.7%   26.7%   28.3%   25.0%
10k      36.7%   38.3%   38.3%   40.0%   28.3%   31.7%   33.3%
5k       31.7%   31.7%   40.0%   41.7%   31.7%   33.3%   36.7%

One-part  model  using  log  transformation  Groups  of  500 

50k      26.7%   31.7%   28.3%   26.7%   20.0%   10.0%   10.0%
25k      25.0%   25.0%   23.3%   28.3%   20.0%    8.3%    6.7%
10k      31.7%   38.3%   35.0%   33.3%   15.0%    0.0%    0.0%
5k       36.7%   38.3%   41.7%   48.3%    8.3%    0.0%    0.0%


Two-part  model  using  log  transformation  Groups  of  500 


50k      26.7%   23.3%   13.3%   13.3%   25.0%   20.0%   20.0%
25k      26.7%   26.7%   18.3%   16.7%   26.7%   21.7%   23.3%
10k      35.0%   38.3%   23.3%   23.3%   40.0%   23.3%   23.3%
5k       36.7%   43.3%   31.7%   30.0%   45.0%   33.3%   35.0%

Four-part  model  using  log  transformation 

Groups  of  500 

50k      25.0%   25.0%   28.3%   23.3%   26.7%   26.7%   28.3%
25k      25.0%   30.0%   28.3%   33.3%   40.0%   41.7%   36.7%
10k      35.0%   40.0%   40.0%   41.7%   40.0%   41.7%   43.3%
5k       41.7%   45.0%   50.0%   50.0%   46.7%   46.7%   46.7%